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I.  I N1ECDUCTIG  N 


A.  A  IfcESIS  OVERVIES 


The  scope  of  tnis  thesis  is  twofold.  The  first  is  to 
provide  a  methodology  tc  use  in  the  performar.ee  measurement 
cf  a  catalase  computer.  Ihe  second  is  the  application  cf 
tnis  methodology  tc  a  specific  database  system  and  the 
attempt  tc  verify  the  performance  and  capacity  claims  cf  the 
target  system. 

Ihe  database  system  being  evaluated  is  an  experimental 
mult  i -b  acke  rd  database  system  known  as  MDBS.  The  basic 
design  goal  of  MDBS  is  to  develop  an  arcnitecture  which 
spreads  the  work  of  the  database  management  among  multiple 
backends.  MDBS  makes  two  basic  claims  in  its  design.  The 
first  is  that  by  increasing  the  number  cf  backends  used  as  a 
part  cf  the  database  computer  and  by  keeping  the  size  cf  the 
database  constant,  the  response  time  of  the  same  trans¬ 
actions  is  proportionally  decreased.  Tne  second  claim  is 


that  by  increasing  the  number  or  rackends  ar.d  also 
increasing  the  size  of  the  database,  the  response  time 
remains  relatively  constant . 

Ic  ccnduct  the  performance  measurement  of  MDBS,  various 
checkpoints  and  data  collections  are  incorporated  into  the 
system.  Although  all  checkpoints  and  data  collections  are 
selected  tc  provide  the  greatest  amount  of  useful  informa¬ 
tion  and  to  incur  the  least  amount  of  overhead,  seme  over¬ 
head  is  unavoidable.  A  quantitative  method  for  measuring 
the  overhead  incurred  is  therefore  provided.  The  perform¬ 
ance  results  of  MDBS  are  then  accurately  adjusted  using  the 
overhead  calculation.  In  this  way,  a  truly  accurate  meas¬ 
urement  cf  the  system  may  be  obtained. 


the  number 


backends 


As  a  a  ethodcio  g  y ,  the  thesis  descnces  the  stratc-ies 
and  iccaticr.s  of  the  checkpoint  p xacesent,  tnt  kinds  of  data 
on  r er icrmacce  collected,  the  ways  in  wnich  the  perfcrianct 
lease  re  icent  were  conducted  and  tne  interpretation  or  the 
results-  M,aybe  of  greatest  importance  is  the  ability  to 
calculate  actual  measurement  overhead  allowing  for  tne  pres¬ 
entation  of  truly  accurate  results. 

In  tnis  thesis,  we  will  focus  our  attention  on  the 
response  tine  of  the  work  being  done  by  the  database  system. 
We  will  net  focus  on  the  throughput.  Whereas  the  throughput 
is  defined  as  the  average  number  of  user  reguests  executed 
by  the  system  in  a  second,  the  response  time  of  a  reguest  is 
the  time  between  the  initial  issuance  of  the  reguest  by  a 
user  and  tie  fiDal  receipt  of  the  entire  response  set  of 
this  request  by  the  user  £  Bef .  1].  Since  the  majority  of 
the  reguests  processed  by  a  database  system  are  reguests  icr 
the  retrieval  of  information,  another  limitation  is  made  to 
the  scope  of  this  thesis.  We  will  focus  on  the  performance 
measurement  of  the  response  time  of  retrieval  reguests  in 
MDBS.  hopefully,  tnese  evaluations  will  verify  the  claims 
of  KE3S  and  also  provide  a  general  methodology  for  the 
performance  measurement  of  any  database  system. 

E.  I  EE  CBG  ANIZATION  CF  THE  THESIS 

This  thesis  is  organized  into  six  additional  chapters 
beyond  this  overview.  Chapter  II  describes  our  performance 
measurement  methodology  for  database  systems.  It  initially 
discusses  the  need  fer  such  a  metnodology  and  continues  with 
a  separate  discussion  of  both  the  internal  and  external 
performance  measurements.  The  chapter  then  culminates  with 
a  discussion  of  tne  combination  of  the  two  performance  meas¬ 
urements,  thus  providing  the  methodology  to  calculate  and 
adjust  fer  internal  performance  measurement  overhead. 


Charter  III  presents  an  overview  or  the  target  system, 
MDBS,  used  to  apply  the  performance  measurement  methodology. 
A  general  discussion  is  giver,  or.  the  attn bute-based  data 
model,  the  directory  tables,  the  process  structure,  the 
message  types,  ar.d  the  execution  or  a  retrieve  request. 

The  application  cf  ti.e  performance  measurement  metred- 
ciogy  tc  the  target  system,  MDBS,  is  presented  in  Chapter 
IV.  The  required  modifications  to  t  ne  MDBS  software  r.eeued 
to  perform  the  measurements  is  discussed,  along  witr.  a 
discussion  of  the  modifications  to  the  test  environment 
reguired  to  control  the  measurement  results.  A  description 
cr  the  additional  software  used  ror  both  inter-computer  ar.d 
inter-pr ccess  message  processing  measurements  is  also 
provided . 

Chapter  V  presents  the  construction  of  the  test  database 
and  the  selection  of  the  requests  used  in  the  performance 
measurements.  In  this  chapter,  the  design  of  the  desired 
test  database  is  first  discussed.  Due  to  system 
constraints,  only  a  subset  of  this  design  is  used  for 
testing  purposes.  The  chapter  concludes  with  an  analysis  of 
the  requests  used  in  the  performance  measurement. 

All  the  thesis  werk  is  brought  together  in  Chapter  VI 
with  the  presentation  of  the  performance  measurement 
results.  Since  the  goals  cf  this  thesis  are  to  verify  the 
performance  and  capacity  claims  of  MDBS  and  to  provide  a 
methodology  for  the  performance  measurement  of  a  database 
system,  only  the  tests  needed  to  obtain  these  goals  are 
performed.  In  the  chapter,  results  are  provided  for  the 
external  and  internal  performance  measu  remer.ts ,  and  the 
results  cf  the  message  processing  measurements. 

The  thesis  ends  with  conclusions  in  Chapter  VII  wr.ich 
can  he  made  from  the  results.  It  provides  a  summation  for 
the  entire  thesis  acd  offers  suggestions  in  future  work 
which  needs  to  be  dene  both  with  tne  methodology  and  with 
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II-  IZEFOBMA NCE  BIASORE HENT  METHODOLOGY  FOR  DATABASE 

S YSTEMS 


In  this  cr.apter,  we  present  a  ..ct f or sar.ce  measurement 
met  no Jc ic gy  for  dataiase  systems.  Tne  methodology  requires 
t.ie  ccllection  of  Loth  internal  ana  external  performance 
Btas are  lent s .  The  internal  performance  acasuresent  method¬ 
ology  is  the  collection  of  metnods  and  tools  which  will 
enable  a  tetter  understand! n g  of  tne  target  system  tv  meas¬ 
uring  certain  caggab  ilit  ies  or  that  system.  In  measuring 
Certain  capabilities  cf  tne  system,  we  focus  on  the  measure- 
ment  cf  time  srent  in  individual  processes  of  t.,t  target 
system.  The  external  performance  measurement  .met  hcdci  c  gy  is 
tne  collection  cf  methods  and  tools  which  will  enable  the 
tetter  understanding  cf  the  target  system  by  measuring  the 
sys t e jr  as  a  whole.  In  measuring  the  system  as  a  whole,  we 
focus  cr.  the  measurement  of  tne  response  time  of  the  target 
system.  Tne  response  tine  in  a  iatanase  system  is  defines 
in  [Ref.  1  ]  as  the  time  between  tne  initial  issuance  of  tne 
reguest  ty  a  user  and  the  final  receipt  of  the  entire 
response  set  of  this  reguest  ty  the  user. 

In  the  rest  cf  tnis  cnapter,  we  begin  by  examining  the 
need  for  a  dataoase  system  and  the  subseguent  need  to 
measure  the  performance  of  the  system.  tie  then  discuss  a 
general  perroraance  measurement  methodology,  addressing  ncth 
internal  and  external  performance:  measurement  as  separate 
issues.  Finally,  we  conclude  the  chapter  with  a  discussion. 


A. 


TEE  NEED 


The  need  for  a  catalase  car.  Lest  re  shows  as  ccrne- 
s  p  o  n  c  i  n  g  tc  the  need  for  inf  creation.  A  database  is  a 
repository  for  the  storage  of  information  on  a  computer,  any 
item  cr  combination  cf  items  of  which  can  be  easily  accessed 
in  a  relatively  short  timeframe.  A  businessman  may  desire 
ail  the  latest  pieces  of  information  to  make  a  management 
decision.  The  combat  field  commander  may  desire  complete, 
up- t c- the-minu t e  reports  to  arrive  at  a  tactical  decision. 

Eut  there  are  performance  and  capacity  problems  that 
must  be  overcome  in  providing  tnis  information.  As  an  ever 
increasing  amount  of  information  is  stored  in  a  database, 
the  response  time  cf  the  database  system  increases  notice¬ 
ably.  In  addition  tc  the  increase  in  the  size  of  the  data¬ 
base,  there  is  the  effect  cf  increasing  tne  number  cf  users 
accessing  the  system  and  the  number  of  reguests  tc  be 
processed  by  the  system.  Thus  the  user  must  select  between 
the  response  time  desired  and  tne  information  desired,  a 
choice  the  user  does  not  want  to  and  should  not  have  to 
make.  The  database  system  needs  to  be  easily  upgraded  to 
accommodate  new  users  and  to  increase  •  the  database  size 
without  noticeable  change  in  response  time.  Tnis  is  the 
need  for  the  r es^onse^time  invariance  m  a  database  system. 

Another  problem  is  in  the  timeliness  of  a  response.  The 
database  system  should  offer  a  dependable,  constant  return 
rate  fer  the  response  to  a  reguest.  when  response  time 
becomes  unreasonably  long  due  to  the  computer  workload,  the 
user  will  be  frustrated.  A  user  desires  to  have  every 
reguest  returned  in  a  timely  manner.  This  is  the  need  ror 
resjjcnse-time  consistency  in  a  database  system. 

A  final  problem  is  to  insure  that  all  necessary  informa¬ 
tion  is  available  to  the  user.  Incomplete  information  is  of 
little  use.  For  example,  a  user  may  reguire  all  reguests  to 


have  a  response  withir.  a  specified  ti  at  f  ra  me.  This  reguire- 
ment  often  dictates  the  naxisui  size  of  the  database  anc  the 
maximum  ranter  cf  requests.  Therefore,  ar  undcsirearie 
limitation  is  placed  cr  the  aicurt  of  inf  ormatior.  availatlc 
due  tc  the  limitation  cn  database  size.  Again,  the  user  is 
forced  into  making  a  tradeoff  between  the  response  time  and 
tre  available  information.  Never tneless,  despite  the 
response  time,  such  information  should  be  made  available  to 
the  user  on  demand.  This  is  the  need  for  aval lability  or 
information  in  the  database  system. 

Therefore,  not  only  is  there  a  need  for  a  database 
system,  there  is  also  a  need  for  a  database  system  with  the 
guaiities  cf  Invariance,  Consistency,  and  Availability 
(ICA).  Eut  ICA  can  be  present  in  varying  degrees  in  a  data¬ 
base  system.  The  degree  of  ICA  can  best  be  demonstrates  by 
the  performance  measurement  of  the  database  system. 

There  are  two  basic  types  of  database  systems.  The 
first  is  an  online  software  datarase  management  system  that 
runs  cn  the  host  computer  system.  The  second  is  a  dataaase 
machine,  which  offloads  the  database  functions  tc  a  dedi¬ 
cated  backend  computer.  The  current  trends  in  database 
systems  involve  the  design,  implfement ation ,  and  use  cf  data¬ 
base  machines  £  Ref .  1  taro  ugh  8 J.  Net  only  is  there  an 
apparent  improvement  in  ICA  with  a  corresponding  price  per 
performance  advantage,  but  a  datarase  machine  can  free  up 
resources  at  the  host,  provide  support  for  multiple,  dissim¬ 
ilar  hosts,  and  increase  the  security  on  the  database  by  tne 
physical  separation  cf  the  database  and  the  host.  Cue 
primarily  tc  the  trer.d  toward  increasing  ruture  use  cf  data¬ 
base  machines,  this  thesis  will  concentrate  on  the  discus¬ 
sion  and  application  of  the  methodology  for  measuring  the 
database  machines. 

A  database  machine  is  a  database  system  composed  cf  one 
cr  more  processors,  dedicated  to  performing  the  database 


18 


management  functions 


It  is  1  r.Uj.sp'1  taile  that  a  datuias^ 


mac  nine  is  tte 


fetter  of  tne  two  tvves  of  djta^ist  svstems 


wit  a  regards  to  providing  an  increase  in  security,  aiicwinj 
for  auitipie  host  support,  and  freeing  u*.  tr.e  nest 
resources.  5ut  there  still  exists  tne  need  to  demonstrate 
an  improvement  in  the  ICA  on  a  database  machine  ever  the  ICA 
provideu  ty  a  hest-  iesiden t  datanase  system.  At  the  saat 
time,  there  exists  a  need  to  compare  the  invariance,  consis¬ 
tency  and  availability  cf  several  difrerent  database 
irac  bines  and  software  syst  ems.  A  yam,  this  can  rest  he 
demonstrated  by  measureing  these  systems. 

Be f p ense-t ime  consistency  is  more  easily  acnieved  ir  a 
catalase  machine  that  in  a  datanase  system  running  cn  the 
host.  Whereas  the  host  must  share  its  resources  with  a 
varying  workload,  the  backend  can  dedicate  its  resources  for 
database  management.  Availability  frees  the  latanase 
Administrator  from  the  necessity  to  make  tradeoffs  between 
the  sire  of  the  database  and  the  response  time.  The  adminis¬ 
trator  can  then  load  the  database  with  all  the  necessary 
information  regardless  or  the  database  sire.  To  achieve  and 
verify  the  response  time  invariance  of  a  database  machine, 
a  methodology  to  measure  its  effectiveness  must  be 
deve lepe  d. 

Thus,  the  scope  cf  this  thesis  is  to  provide  a  perform¬ 
ance  measurement  methodology  for  database  machines  and  to 
verify  this  methodology  by  verifying  the  design  claims  of  a 
specific  database  machine,  known  as  "IDBS.  Again,  these 
claims  are  related  to  the  guaiity  of  response  time  invari¬ 
ance;  that  is,  to  be  able  to  change  toe  size  of  the  aatacase 
and  at  the  same  time  maintain  constant  response  time  cr  to 
hold  constant  the  size  of  the  database  with  the  ability  to 
reduce  the  response  time.  Conse guen tiy ,  the  measurement  of 
tne  response  time  of  a  database  system  becomes  the  focal 
point  cf  our  studies.  If  the  response  time  can  be  properly 


and  dccurattl;  measured,  the  claims  of  tne  target  system  car. 
he  verified.  7  ur  thermore,  the  e:  f  ec  ti  ver.ess  or  the  aeti.cd- 
c  I  o  g  y  car.  also  re  verified.  A  proper  measurement  cf  the 
response  tire  can  provide  a  baseline  measurement  to  wrier, 
ether  database  systems  can  re  compared  and  thus  prevne  a 
price-performance  comparison  cf  various  systems.  Ti.is 
thesis  provides  an  o  ver  n  ead-f  r  ee  perf  orira  r.ce-measurenent 
methodology  and  applies  this  methodology  to  verify  the 
chains  of  a  r.  experimental  database  machine. 

E.  1 FE  SFFFOACH 

Ir.  this  section,  we  discuss  a  general  methodology  to  re 
used  in  the  performance  measurement  of  a  database  system. 
Tnis  methodology  is  general  and  can  be  applied  to  any  ether 
database  system.  'tie  first  discuss  the  internal  performance 
measurement.  Tnis  irciudes  the  design  consi dera tions ,  the 
software  engineering  criteria  and  the  application  cf  the 
methodology  to  a  particular  system.  Then  we  present  a 
discussion  of  the  external  performance  measurement,  again 
discussing  the  design  considerations ,  tne  software  engi¬ 
neering  criteria,  and  the  application  of  the  methodology  to 
a  particular  system. 

1  .  A  ISisrnax  Peri ormance  Measure m en t 

The  goal  of  the  internal  performance  measurement 
methodology  is  to  provide  methods  and  tools  which  will 
er.acle  us  to  better  understand  the  target  system  by  meas¬ 
uring  certain  aspects  of  that  system.  A  complete  under¬ 
standing  of  how  the  system  performs  internally  may  lead  to 
design  mcdirica tions  or  to  fine-tuning  of  tne  system  nor 
better  performance.  The  internal  performance  measurement 
tools  should  be  unobtrusive  to  the  user,  available  when 
necessarj,  yet  out  of  the  way  waea  not  reguired.  Tney  should 
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te  integrated  with  the  tar  jet  sg  s  tem  to  produce  a  tiooth 
transition  between  target  system  operation  and  tne  operation 
cr  the  tcci.  In  the  first  part  or  t  .us  section ,  we  address 
the  design  considerations  cf  internal  performance  measure¬ 
ment  methods.  Nat,  we  discuss  certain  software  engineering 
criteria  whicii  are  applicable  to  the  design  or  gool  measure¬ 
ment  tools.  Finally,  we  explore  tne  application  cr  the 
internal  rer£or nance  measurement  methodology  to  a  particular 

s  y  S  t  €  H  • 

a.  Design  Considerations 

Internal  performance  measurement  relies  on 
checkpoints  internal  to  the  database  system  software.  A 
checkpoint  is  defined  as  a  procedural  invocation  inserted 
into  tne  system's  flew  cf  control  to  call  the  performance 
measurement  routines  which  are  used  for  the  data  collection. 
System  overhead  is  irtroduced  as  eacn  checkpoint  is  added  to 
tne  target  system.  Additionally,  measurement  software  is 
repaired  tc  process  tie  checkpoint  lata  in  a  manner  compat¬ 
ible  with  the  existing  target  system  soicware.  That  is,  a 
certain  portion  of  tne  measurement  software  must  te  inte¬ 
grated  with,  the  target  system  software  to  handle  events  s  uch 
as  data  storage,  message  jissing,  and  information  processing 
that  relate  to  the  ctecx, oi nt  data.  Finally,  the  existing 
target  system  software  may  require  additional  lines  cf  cede 
to  nandie  new  cases  irtroduced  ty  the  measurement  system. 

Ir.  most  e>ternai  performance  measurement,  over¬ 
head  is  negligible.  However,  internal  aeasurement  routines 
adu  significant  overhead  to  tne  database  system  winch  cannot 
he  disregarded.  For  internal  measurement,  we  must  discover 
ways  tc  reduce  tne  cverr.ead  generated  by  tne  measurement 
software.  Vie  must  also  te  able  to  measure  the  overhead 
wiiich  carnet  te  elimirated,  sc  tnat  tne  measurements  can  be 
adjusted  accordingly.  A  very  important  r  eegu  ireme  r.  t  is  that 


txxe  existing  target  sys tern  must  maintain  tne  caparixity  of 
running  unimpeded  by  tie  additional  measurement  software. 

Consider ation  must  be  given  to  tne  level  in  the 
target  system  where  checkpoints  may  De  piaced.  Some  possible 
levels  are  at  the  very  high  level,  i.e.,  tne  system  level, 
the  big:,  level,  i.e.  ,  the  program  lever,  tne  medium  level, 
i.e.,  the  subroutine  level,  and  the  low  level,  i.e.,  the 
subroutine  segment  level.  whereas  external  performance 
measurement  only  places  checkpoints  at  tne  very  high  level, 
internal  performance  measurement  places  checkpoints  below 
that  level.  Checkpoints  must  he  placed  at  a  level  which 
produces  data  in  sufficient  detail  to  provide  the  user  with 
a  basic  understanding  cf  the  system's  perrormance  character¬ 
istics.  Checkpoints  should  net  be  placed  at  a  level  sc  low 
as  tc  overwhelm  tne  user  with  detailed  data  or  to  interfere 
signif icactiy  with  system  performance. 

For  internal  performance  measurement,  the  user 
should  have  tne  capability  to  access  selected  data  cut  of  a 
range  cf  possible  chcices.  The  user  snould  rot  be  required 
to  receive  information  about  processes  which  are  net  of 
current  interest.  The  interface  should  be  easy  to  use  and 


should  net  distract  the  user  rrem  his  primary  goal  cx  under¬ 
standing  the  database  system  by  requiring  the  user  to 
remember  the  unigue  syntax  or  semantics  of  the  test  inter¬ 
face.  The  collected  measurements  snould  be  made  accessible 
to  automated  processing  routines  for  aata  reduction. 

h.  Software  Engineering  Criteria 

hleasurement  software  should  be  designed  using 
modern  software  engineering  methods.  The  resulting  software 
should  re  understandable,  maintainable,  reliable  and  compat¬ 
ible  with  the  target  system.  Certain  software  engineering 
metneds  are  of  particular  interest.  These  methods  are  modu¬ 
larization,  user-f  r  uendii  ness,  da  ta  abs  tract  ic  r. ,  ani 
simplicity. 


Foe  loduid 
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toe  get  Modularity  allows  the  system  to  tt  easily 
txttr.de  i  to  cnec  kpoints  not  considered  ir.  the  initial  speci¬ 
fications.  The  test  interface  shoall  present  an  easy-tc-use 
method  for  obtaining  test  data.  It  saouid  automatically 
aggregate  data  while  still  allowing  tne  aser  to  access  raw 
data.  The  user  should  not  have  to  remember  the  specific 
syntax  and  semantics  cf  the  test  interface.  Data  abstrac¬ 
tion  should  be  used  sc  that  subsequent  program  modifications 
do  net  result  in  extensive  reprogramming.  An  appropriate 
cnoice  cf  primitives  (  data  structure  and  operations  )  will 
allow  for  easy  change  and  produce  less  system  overhead.  The 
measurement  system  sbculd  be  user- friendly .  In  addition  to 
obeying  the  simplicity  principle,  the  test  interface  should 
be  forgiving,  i.e.,  system  should  not  crash  on  bad  input, 
provide  readable  error  diagnostics,  anticipate  errors,  and 
guard  against  these  errors. 

c.  Issues  in  the  Application  to  Database  Systems 

Application  of  the  internal  performance  measure¬ 
ment  methodology  to  a  particular  database  system  reguires 
that  the  evaluator  understand  certain  aspects  of  the  target 
system.  The  evaluator  must  understand  the  programming 
language  used  to  construct  the  database  system,  and  the 
structure  and  operation  of  the  database  system.  The  evalu¬ 
ator  must  be  prepared  to  overcome  obstacles  presented  by  the 
target  system  in  the  course  of  tne  implementation  cf  the 
performance  measurement. 

A  thorough  understanding  of  the  programming 
language  is  necessary  to  successfully  integrate  checxpcirts 
and  data  collection  programs  into  tne  existing  software 
structure.  Cne  must  be  familiar  witn  the  data  structures. 


control  structures,  caning  conventions,  and  par ameter- 
pas  si  n  ;  me chan is  ms  ci  tne  id  i.  g  u  a  ge  ,  r  n  o  r  o  e  r  to  i or  accent 
tne  measures  et  t  programs  e  r  r  r  c  i  e  n  t  x  y  a  r.  a  t  c  minimise  t  r.  e  i  r 
ever  he  ad.  Knowledge  cf  ti.e  i  a  n  g  u  a  g  e  syntax  reduces  program¬ 
ming  enters  and  steeds  impicTentatr  or.  of  the  measurement 
tools  . 

For  effective  internal  ptrf oraance  teas  are  me  r.t , 
checkpoints  must  he  correctly  placed  An  tne  iatanase  system. 
Incorrectly  placed  checkpoints  increase  overread  and  degrade 
perfcriar.ee  me  a  sure  lent  Lv  providing  useless  data  tc  the 
user.  The  evaluator  oust  possess  safncient  knowledge  on  the 
target  system  to  allow  for  the  correct  placement  of  check¬ 
points.  This  provides  the  smoctn  integration  of  data  eexiee- 
tion  prcgiais,  data  processing  programs  ana  data  transfer 
programs  into  the  existing  datahase  system. 

Chances  are  that  the  target  system,  when 
initially  designed,  was  not  designed  with  internal  perform¬ 
ance  measurement  in  mind.  Instead,  the  target  system  was 
designed  tc  process  ail  reguests  efficiently.  Integration 
cr  the  internal  performance  measurement  routines  may  affect 
tne  target  system  ir.  unexpected  ways.  Let  us  consider  two 
examples  cf  such  ways.  First,  in  a  message-passing  system, 
messages  generated  ty  the  measurement  programs  may  reunite 
modi  f  ica  tio  r.s  to  the  existing  datahase  system  so  tr.at  test 
messages  will  not  he  confused  with  tae  messages  of  the  data¬ 
hase  system.  Second,  tae  volume  of  information  generated  ty 
the  measurement  programs  may  overload  selected  sections  of 
the  target  system.  The  evaluator  of  tne  performance  meas¬ 
urement  routines  must  he  prepared  for  suen  contingencies. 
Ey  using  tr.e  knowledge  of  tne  programming  language  along 
with  tt<=  knowledge  cf  tne  datahase  system,  tne  evaluator 
must  he  prepared  to  offer  solutions  to  the  database  adminis¬ 
trate!  cr.  new  to  gracefully  integrate  the  perform.ar.ee  meas¬ 
urement  mechanisms  into  the  target  system  wit.o  picter 
modif  ica  tier,  and  without  overhead. 
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us  to  letter  understand  tne  tar  jet  system  t  x  aeas  inn  g  tn  _• 


system  as 

a  w  n  ole. 

I n  this  way  we  car. 

n .  tr  a  £  u  ir  t  the 

total 

work  tert; 

cone  ty  t 

le  database  system. 

'■le  t  o c  1  s  or: 

u':do" 

uring  the 

response 

time  of  the  system. 

the  e I d  c  e  e i 

tlx  a 

between  the  issuance  of  a  request  and  the  receipt  cf  tie 
response  tc  tne  request. 

1  n  t  e  rn  a  x  performance  measurement  has  seen  sue w  r  to 
le  beneficial  in  the  fine-tuning  of  a  system,  an  i  in  tne- 
iticr  csccpic  examination  of  the  work  rei  r.g  p  error  ned  ty  the 
system-  External  measurement  provides  a  quantitative  meas¬ 
urement  cf  the  system  from  a  macroscopic  view.  This  allows 
for  the  comparison  of  datanase  systems.  In  tr.e  first  tart 
cf  the  section,  we  discuss  the  dcsija  considerations  cf  tne 
external  performance  measurement  metaois.  Next,  we  present 
tne  software  engineering  criteria  for  external  performance 
measurement.  Lastly,  we  show  tne  application  01  the 

external  performance  measurement  to  a  system. 

a.  Design  Cc  aside  rations 

External  performance  measurement  snouid  have 
negligible  overhead,  i.e.,  the  response  time  with  external 
performance  measurement  should  he  tne  Same  us  the  response 
time  without  measurement  neir.  g  performed.  This  is  in  fact 
tne  case.  The  reason  t:.a  t  t..e  ov^rnead  is  negligible  is 
tha  t  only  two  timing  chec.<  p  mints  r.e-.-u  to  Le  made.  These 
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level  to  insure  a  complete  measurement  or  the  total  elapsed 


time . 


There  are  other 


that  must  re  considered 


to  insure  that  the  system  being  evaluate!  is  as  'pure'  as 
possible.  First,  the  system  should  retain  only  these  cede 
an  1  messages  required  for  the  running  or  tne  system. 
Messages  ar.d  code  incorporated  into  tne  system  fer  tne 
design  cr  debugging  cr  the  system  snoull  be  removed. 
Second,  the  system  shouic  not  contain  unnecessary  software 
toois  designed  tc  aid  tne  measurement,  such  as  those  used  to 


create  a  test  database. 


tools  snouid  remain  in  soft¬ 


ware  exterior  to  the  actual  database  system. 

An  obvious  consideration  is  to  insure  that  no 
human  interaction  is  involved  in  the  timings.  The  system 
software,  net  the  reaction  time  cf  the  user,  is  being  timed. 
Tnerefore,  the  timer  should  start  immediately  after  a  user 
releases  the  request.  The  timer  should  stop  immediately 
prior  to  the  display  on  the  selected  output  device.  Ine 
reason  for  stopping  the  timer  prior  to  display  is  due  tc  the 
varying  delays  caused  by  the  output  devices.  The  speed  of 
an  cutcut  device  should  net  be  included  into  the  system 
timing  results. 

The  final  issue  involving  the  placement  of 
external  performance  measurement  checkpoints  is  whether  to 
embed  the  tinier  code  in  the  system  or  to  call  a  timer 
routine  outside  the  system.  A  call  to  a  timer  routine 
incurs  unwanted  timing  delays,  adding  to  tne  impurity  of  a 
system.  if  the  timer  code  is  embedded,  it  car.  be  mace  to 
appear  that  the  system  code  being  tested  is  embedded  in  the 
timer  cede,  i.e.,  placing  the  timer  initialization  cede  just 
prior  tc  the  point  of  the  request  by  tne  user  and  the  timer 
fine  lizaticr.  code  just  subsequent  to  tne  display  cn  the 
output  device.  Kith  these  considerations,  an  optimal  place¬ 
ment  or  checkpoints  can  be  selected  to  take  external 


perfcraar.ee  timings. 


b. 


Software  hngineenng  Criteria 


n 


Unlike  icternal  performance-measurement  software 
which  uses  software  design  setnodologies,  the  external 
f er f cr lance-measuremer.t  software  uses  software  design  tools. 
In  [Eef.  a  lull  description  is  provided  of  tne  necessary 
external  peifor mance-measur  emer.t  tools.  These  tools  include 
a  test-file  generation  package,  a  database  load  subsystem, 
and  a  request  generation  package. 

The  purpose  or  the  test-file  generation  package 
is  tc  create  a  test  database.  T».is  allows  for  the  easy 
creation  of  a  database  containing  tne  desired  parameters  to 
re  evaluated.  The  oatanase  load  subsystem  must  prcterly 
loaf  the  files  treat  d  m  the  generation  package.  This 
includes  the  creation  cf  directories  for  the  test  database. 
The  reguest  generation  package  is  used  to  create  and  execute 
test  reguests,  and  provides  ter  easy  variance  in  the  types 
and  complexity  of  requests.  This  package  also  archives  the 
requests  xcr  later  tse.  Using  these  tools,  the  external 
performance  timings  cf  the  database  system  under  measurement 
can  be  easily  obtained. 

c.  Issues  in  tne  Application  of  tne  Methodology 

The  ease  with  which  external  performance  meas- 
uremert  can  be  performed  on  a  database  system  car.  vary. 
Tnere  are  two  important  considerations:  the  language  in 
which  the  system  is  written  and  the  degree  of  software  engi¬ 
neering  used  in  the  database  system  design. 

The  language  needs  to  ce  readable  and  tc  ccmrli- 
ment  proper  documentation  of  the  system.  This  will  facili¬ 
tate  an  understanding  cf  the  system  by  the  system  evaluator. 
The  language  must  also  be  powerful  enough  to  easily  incorpo¬ 
rate  system  commands,  such  as  reguests  for  the  system  time. 
A  language,  such  as  C,  has  these  capabilities,  beir.^, 
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primarily  designed  fci  system  programming.  C  is  a  r.i3h- 
ltvel  language,  that  is  both  powerful  and  portable. 
Although  the  support  software  tools  such  as  datarase  lead 
can  he  i up lerae nted  in  a  language  otner  than  tue  language  in 
which  tie  database  system  was  written,  the  evaluator  needs 
to  be  familiar  with  several  different  languages  if  several 
different  database  systems  are  to  be  evaluated. 

The  degree  of  software  engineering  used  in  tie 
database  system  design  will  most  definitely  facilitate  any 
external  performance  measurement  to  be  done.  If  the  data¬ 
base  system  was  hierarchically  designed  using  modularity, 
knowledge  cf  the  internal  workings  of  the  system  iy  the 
evaluator  will  be  minimal.  Only  the  upper  level  in  the 
hierarchy  need  to  be  studied  for  the  proper  placement  of  the 
checkpoints.  External  measurement  only  requires  a  macro 
knowledge  of  the  system.  This  is  to  insure  that  the  check¬ 
points  are  indeed  properly  placed  at  the  very  high  level. 

C.  TEE  COMBINATION  CF  IN  TEE  Nil  AND  EXTERNAL  PERFORMANCE 

H  F ASCBEMENTS 

Separately,  internal  and  external  performance  measure¬ 
ments  provide  a  wealth  of  information  to  the  evaluator. 
Internal  performance  measurement  provides  the  timings  and 
data  collections  of  individual  processes  in  the  database 
system.  External  performance  measurement  provides  the 
elapsed  time  for  the  complete  reguest.  Yet,  when  the  two 
methodologies  are  combined,  there  is  a  synergistic  efxect  to 
the  amount  cf  information  available  to  the  evaluator. 

The  com bi nation  cf  internal  and  external  performance 
measurements  is  natural.  There  are  benefits  to  te  gained 
for  one  from  the  other.  For  example,  we  can  determine  the 
overhead  incurred  when  using  internal  performance  measure¬ 
ment;  first,  using  the  external  checkpoint,  we  collect  the 


elapsed  tiie  for  t  recessing  a  particular  request.  Iris  tame 
is  then  compared  to  the  elapsed  tilt  or  the  request  w  ter. 
Loth  internal  and  external  ebeex^omts  arc  enarled.  In  e 
difference  in  the  elapsed  tames  of  taese  two  measurements 
pro  vide s  an  exact  measurement  of  the  overhead  incurred  by 
the  irterr.al  performance  measurement  software  fer  t^is 
request. 

C  r.  tie  ether  hand,  we  can  use  the  internal  perfcrmar.cc 
measure rent  'timinqs  tc  interpret  the  external  perrcrmar.ee 
measurement  timinqs.  In  particular,  if  a  request  takes  many 
hundredths  cf  a  seccrd  as  a  result  of  external  performance 
measurement,  the  evaluator  would  want  to  determine  the 
precise  distribution  of  the  work.  Internal  performance 
measurement  can  answer  these  questions.  By  combining  the 
two  measurements,  tie  whole  cf  the  measurement  results  is 
more  meaningful  and  tsefui  tnan  the  individual  results. 

In  the  following  chapter,  the  target  system,  i.e.,  :1IB5 
is  described.  This  is  the  system  selected  to  be  evaluated 
using  tie  internal  and  external  performance  measurement 
methodologies  presented  in  this  chapter. 


IT:.  THS  HULTI-BACK2 RD  CAT  A3 AS  2  SYSTEM  (MDBS) 


Id  t  r.is  cna  pter  we  discuss  ti.e  conf  uratior.  ar.d  theory 
cz  operation  01  tae  multi-tack  end  database  system  <MC5SJ. 
This  chapter  has  been  extracted  from  papers  and  reports 
which  have  been  written  on ■ MDBS  [Ref.  6,  10,  11,  12]. 

MIES  uses  twc  or  sore  identical  minicomp  uter  s  ar.d  tneir 
drsA  systems  to  provide  a  centralized  database  system  with 
support  for  multiple,  dissimilar  hosts.  One  minicomputer 
functions  as  the  controller.  User  access  is  accomplished 
through  a  host  computer  which  in  turn  communicates  with  the 
controller.  Multiple  minicomputers  and  their  disKs  are 
configured  in  parallel  to  serve  as  backends.  The  original 
design  and  analysis  cf  MDBS  is  due  to  J.  Menon  [fief.  1,  2]. 
The  implementation  and  new  design  efforts  are  documented  in 
[Ref.  3  through  6].  The  database  is  distributed  across  all 
cf  the  iacxends.  The  database  management  functions  are 
replicated  in  each  backend. 

As  shown  in  Figure  4.1,  the  controller  and  the  backends 
are  connected  by  a  broadcast  cus.  When  a  transaction  is 
received  frcm  the  host  computer,  the  controller  broadcasts 
tr.e  transaction  to  all  the  backends.  Each  backend  has  a 
number  cf  dedicated  disk  drives.  Since  the  data  is  distrib¬ 
uted  across  the  backends,  a  transaction  can  be  executed  by- 
ail  hackends  concurrently.  Each  backend  maintains  a  3ueue  of 
transactions  and  schedules  reguests  for  execution  inde¬ 
pendent  cf  the  other  backends,  in  order  to  maximize  its 
access  operations  acd  to  minimize  its  idle  time.  The 
controller  does  very  little  work.  It  is  responsible  for 
broadcasting,  routing,  and  assisting  in  tne  insertion  cf  new 
data.  The  backends  do  most  of  the  database  operations. 

,  MD3S  is  fully  operational  with  a  VAX  1  1/783  as 
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the  controller  and  two  PDF  1  1/ 4 ^  ana  tneir  aisks  at  the 
backer ds . 

£155  is  a  lessaje-oricn  tea  system.  In  a  messa ge— cri en ted 
system,  each  process  corresponds  to  one  system  rur.cticn. 
These  riocesses,  then,  communicate  among  themselves  by 
passirg  messages.  User  reyaests  are  passed  between  processes 
as  messages.  The  message  paths  between  processes  are  fixei 
for  the  system.  The  MDBS  processes  are  created  at  system 
start  tine  and  exist  until  the  system  is  stopped. 

MIES  is  designed  to  perform  the  primary  datanase  opera¬ 
tions,  iNSEFT,  DELETE,  UPDATE,  and  RETFIZVE.  Of  these  roar 
database  operations,  only  the  retrieval  operation  will  be  or 
concern  to  us  in  this  thesis.  The  syntax  and  semantics  of 
tne  retrieve  operation  is  discussed  in  Chapter  V.  Users 
access  MCES  through  the  host  by  issuing  either  a  reguest  or 
a  transaction.  A  transaction  is  a  set  of  reguests.  A 
is  a  primary  operation  along  with  a  qualification.  A 
i£j£i££  is  used  to  specify  the  information  of  the 
catalase  tbat  is  to  be  accessed  by  the  reauest.  here 
complete  definitions  of  the  ME ES  terminology  can  be  found  in 
the  following  section. 

In  the  remainder  of  tnis  cnapter  we  first  discuss  the 
directory  structure.  Next,  we  provide  an  overview  of  the 
process  structure.  Then,  a  presentation  of  the  message  types 
is  provided.  Lastly,  we  trace  tne  execution  sequence  of  a 
retrieve  reguest. 

A.  TEE  ATTRIBUTE— BASED  DATA  MODEL 

In  this  section  we  discuss  the  a ttrin ate— based  data 
model.  Next  we  provide  some  definitions  in  order  to  discuss 
MDBS  directory  data.  We  conclude  tnis  section  by  describing 
the  tables  necessary  to  maintain  the  MDBS  directory 
in formation. 


In  the  attriLute-hased  data  model,  data  is  modeled  wxth 
the  constructs:  datarase,  file,  record,  at  tribute- value 
pair,  directory  keyword,  directory,  record  body.  Keyword 
predicate,  and  guery.  Informally,  a  database  consists  of  a 
collection  of  files.  Each  file  contains  groups  of  records 
which  are  characterized  by  a  unigue  set  of  directory 
keywords.  A  record  is  composed  of  two  parts.  The  first 
part  is  a  collection  of  at  tribute— value  pair  s  or  keywords. 
An  attribute— value  pair  is  a  member  of  the  Cartesian  product 
of  the  attribute  tame  atd  tae  value  domain  of  the 
attribute.  As  an  example,  <P0P'J1ATI0N,  25000>  is  an 
attribute-value  pair  having  25000  as  the  value  for  the  popu¬ 
lation  attribute.  A  record  contains  at  most  one  attribute- 
value  pair  for  each  attribute  defined  in  the  database. 
Certain  at  trib  u  t  e— val  ue  pairs  of  a  record  (or  a  file)  are 
called  tie  director  y  key wor  ds  of  the  record  (file)  ,  because 
either  the  at  trioute- value  pairs  or  their  attribute- va^ue 
ranges  are  kept  in  the  directory  lor  addressing  the  record 
(file).  Those  attribute-value  pairs  which  are  net  kept  ir. 
the  directory  for  addressing  tne  record  (file)  are  called 
non-directcry  keywords.  The  rest  of  the  record  is  textual 
information,  which  is  referred  to  as  the  record  bogy.  An 
example  of  a  record  is  shown  below. 

(  <  f  II E,  Census>,  <CIT¥,  Mcnterey>,  POPULATION,  25CCO, 

{  Temperate  climate  }  ) 

The  angle  brackets,  <,>,  enclose  an  attribute— value  pair, 
i.e.,  keyword.  The  curly  brackets,  (,},  include  tat  record 
body.  The  first  attribute-value  pair  or  ail  records  of  a 
file  is  the  same.  In  particular,  tne  attribute  is  flLE  and 


The  database  is  accessed  by  index  in g  or.  directory 
key  '.crus  us  mg  keyword  teed  icates .  A  keyword  predicate  is  a 
tnree-tuple  consisting  of  an  attribute,  a  relational  eper- 
ator  (=,  * ,  >,  <,  >,  <)  ,  and  an  attribute  value,  i.e., 
FOP’JIATICN  Z  20 0C0  is  a  keyword  predicate.  More  specili- 
caiiy,  it  is  a  grea ter-t han-o r— e 3uai- to  predicate. 
Combining  keyword  predicates  in  disjunctive  normal  :cri 
cnar act erizes  a  g u e r y  of  the  datarase.  The  query 

(  FILE  =  Census  and  CITY  =  Monterey  )  or 
(  FILE  =  Census  and  CIIY  =  San  Jose  ) 

will  be  satisfied  by  all  records  of  the  Census  file  witr.  the 
CITY  or  either  Monterey  or  San  Jose.  For  clarity,  we  also 
employ  parentheses  fer  bracxeting  predicates  in  a  guery. 

Fecall  that  m  MBS  there  are  four  types  or  reguests 
Wiiich  correspond  to  the  four  primary  database  operations.  An 
example  cf  a  retrieve  request  would  be: 

RETRIEVE  (  FILE  =  Census  and  POPULATION  >  1  0000  )  (CIIY) 

which  retrieves  the  tames  of  all  tnose  cities  in  tne  Census 
rile  wncse  fopulation  is  greater  than  10000.  Notice  that 
the  q  ua  1  if  icat  ion  component  cf  a  retrieve  reguest  consists 
of  twe  pares,  the  guery  of  two  predicates  (  FILE  =  Census 
and  FCPOLATION  >  100CC  )  and  the  target  list  (CITY).  Tne 
query  speciiies  which  recorus  or  tne  datarase  are  tc  be 
retrieved.  Tne  target  list  specifies  the  attribute-value  (s) 
to  be  returned  to  the  user.  A  user  may  wish  to  treat  two  or 
more  revests  as  a  t  ran  sac  tier..  In  this  situation,  MDBS 
executes  tne  reguests  of  a  transaction  without  permuting 
tnem,  i.e.,  if  T  is  a  transaction  containing  the  reguests 
<  :•■  1  >  <  F  2  >  ,  tier.  MDBS  executes  the  reguest  R1  before  request 
F2.  firaily,  we  define  the  term  traffic-unit  to  represent 
exti.er  a  single  request  or  a  transaction  in  execution. 


E.  Ill  EIF.ECTORY  TAE1ES 

1c  iana.;e  ti^  datarase  (often  refenei  t j  as  as-a  iata)  , 
.'‘EEC  u£t£  directory  data.  Directory  aata  in  .'IDES  corresponds 
to  attributes,  descriptors,  and  clusters.  An  attrit  ate  is 
used  to  represent  a  category  of  tne  user  deta;  e.  ..  , 
PCPdlAIICH  is  an  attxitute  that  corresponds  to  ict’ai  pofu- 
laticr.s  stored  in  the  database.  A  ifscri^.  tor  is  usei  to 
descrile  a  range  of  values  that  an  attrnute  car.  have;  e-i, 
(100C1  i  POPULATION  f  1500  0)  is  a  oossitie  descnptcr  for 
t.ie  attrifute  POPULATION.  Tne  descriptors  tnat  are  dtiined 
for  a r.  attribute,  e.g.,  pot  ulation  ranges,  are  mutually 
exclusive.  How  the  notion  of  a  cluster  can  re  defined.  A 
cluster  is  a  group  of  records  sucn  that  every  record  in  tne 
cluster  satisfies  the  same  set  of  descriptors.  For  example, 
all  records  with  POPUIATION  between  1  000  1  and  1  5000  ray  icrr. 
cr.e  cluster  whose  descriptor  is  the  one  given  above.  In  this 
case,  the  cluster  satisfies  the  set  of  a  single  descriptor. 
In  reality,  a  cluster  tends  to  satisfy  a  set  of  multiple 
descriptors . 

Directory  information  is  stored  in  three  taries:  the 
Attribute  Table  (AT) ,  the  Descr iptor-to— Descriptor-Id  Taile 
(EDIT)  and  the  Cluster-Definition  Taile  (CDT)  .  The  Att r in ute 
Table  maps  directory  attributes  to  tne  descriptors  derined 
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Figure  4.2  An  Attribute  Table  (AT) 


Figure  4.3  A  Descriptor-to-Descriptor-Id  Table  (EDIT). 

to  the  AT  table  cf  Figure  4.2.  The  Cluster— Definition  Table 
maps  descriptor— id  sets  to  cluster  ids.  Eacn  entry  consists 
cf  the  unique  cluster  id,  the  set  of  descriptor  ids  wncse 
descriptors  define  the  cluster,  and  the  addresses  cf  the 
records  in  the  Clusters.  A  sample  CDT  is  snown  ir.  Figure 
4.4.  Thus,  to  access  tne  directory  data,  we  must  access  tne 
AT,  E  E  IT  ,  and  CD  T. 

Cne  cf  the  *ey  concepts  used  wnen  designing  the  test 
database  (see  Chapter  7.)  is  defining  the  descriptors  which 
are  specified  in  the  directory  attributes.  Thus,  we  provide 
a  brief  introducticr  to  the  three  c  lassi  f  ica  t  ic  r.  s  of 


descriptors.  A  tyj:e-A  descriptor  is 


conjunction  or  a 
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Figure  4.4  A  Clust er-Eef inition  Table  (CD!) . 

les s- than— c r-eq ual— tc  predicate  and  a  grea ter-than— or-e : a  al¬ 
to  predicate,  such  that  the  same  attribute  appears  m  ret:, 
predicates.  An  example  or  a  type— A  descriptor  is  as 
follows : 

(  (ECEULATICN  >  10000)  and  (POPULATION  <  15000)). 

A  type-2  descriptor  consists  cf  only  an  equality  predicate. 
An  example  cf  a  type-E  descriptor  is: 

(FILE  =  Census)  . 

Finally,  a  ty ,-e-C  descriptor  consists  of  the  name  cr  an 
attribute.  The  type-C  attribute  defines  a  set  of  type— C 
sub-descriptor s.  Type— C  sub-descriptors  are  equality  predi¬ 
cates  defined  over  all  unique  attribute  values  wmer  exist 
in  the  database.  Fcr  example,  the  type-C  attribute  CITY 
forms  the  type-C  sub-descriptors 

(Cl  TY  =  C  umberia  nd)  ,  (CI7Y  =  Columbus) 

where  "Cumberland"  and  "Columbus"  are  tne  only  unique  data¬ 
base  values  for  tne  CITY. 

C.  TEE  EBCCESS  STRUCTURE 

Currently,  iDES  does  r.ct  communicate  witn  a  best 
macaine.  The  aiser.ee  cf  this  ccaauaicatior.  requires  tr.at  the 
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t  os  l  interlace  pi ocess,  the  ftcccis  itti  to  interact 
d53,  he  placed  i r.  the  "IBS  Ccr.tr oil  :-r.  The  current  i 
ler.ta  tier,  c  r  MDBS  does  not  utilize  a  nroa  react  _  us .  Inc 
:i:BS  utilizes  parallel  ccmmunica  ti  or.  a  lines  (rCLi) 
t:  uiate  a  h  road  cast  bus.  Both  the  controller  ir.  c  tht 
ends  ccrtair  processes  to  interface  wit:,  ti.e  ids  fer  i 
computer  message  passing.  .Lest  processes,  wuxxc  r  ect 
to  interface  with  the  PCLs ,  art  not  ruit  or  Id 2 i  and 
not  he  discussed  further.  Injure  u.5  . roviuts  an  eve 
ci  the  hi  2 S  Process  Structure. 
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t:. 


.he  controller  is  u.>*.  .  :  r  t  n  r  <.  e  :  r  c  c  e 

r.ejuest  Preparation,  Insert  Infer:  it  i  net  uti  j.-.c 

Process  i  r  j .  ?.  eouest  Pro  tar  it  ion  .«  -  s  ,  rirs<-  ■ 

formats  a  reeucst  (transact  ion)  it  rot  ..  st-n  _m  :  t ;.  e  f  err 
reu  lest  (transaction)  tc  the  I  i  rector  p  .'•ana  je  t  piece 
each  hacker.c.  Insert  Ir.f  err  a  tio..  (oiei  ati  on  is  nc 
provide  additional  in  fora  ati  cr.  to  tat  Ladens  w ..  e 
insert  request  is  received.  Since  tne  uata  is  distrir 
the  insert  cr.ly  occurs  at  one  of  t.ie  r  a  c, -lends.  In  us  it 
determine  the  lock  end  at  which  the  insert  wixx  occur, 
with  the  cluster  and  descrirtcr  iia  ror  the  insert. 
Process  in j  is  useu  tc  collect  ail  the  results  of  a  rt 
(transaction)  and  forward  the  information  rack  to  tne 
computer. 


2.  1  i.e  Processes  cf  Each  Hack  end 


: a c i,  LacKend  is  alsc  ccaiposeu  of  three  treat 
They  arc  cf  course  different  from  the  controller  piece 
Hey  are:  lirector  y  far.a  je  oier.  t ,  Concurrency  Control, 
Record  I-  roc  ess  i  n  c.  Directory  :iana  jemen  t  performs  Descr 
Search,  Cluster  Search,  an  d  A  ddress  Generation.  Descr 
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reguest.  Cluster  Search  fires  the  exuster  ids.  Aacress 
Generation  determines  tne  secondary  storage  addresses  neces¬ 
sary  tc  access  the  clustered  records.  Concurrency  Control 
detemines  when  the  request  can  be  executed.  Record 
Processing  performs  the  operation  specified  a y  the  reguest. 

E.  TE E  BEES  MESSAGE  TYPES 

In  this  section  we  describe  the  MDBS  message-passing 
facilities  first  described  in  £Eef.  13],  In  the  MZ3S 
lessa ge— passin g  facilities  there  are  31  message  types  and 
cne  general  message  format  (shown  in  Figure  4.6).  This  same 


format  is  used  for  each  of  the  three  message  passing  facili¬ 
ties,  namely,  messages  within  the  controller,  messages 
within  a  hackend,  and  messages  between  computers.  Messages 
between  computers  are  divided  into  two  classes,  messages 
between  iackends,  ar.d  messages  between  the  controller  and 
tne  backends.  Figure  4.7  describes  eacn  of  tne  MDBS  message 
types.  Figure  4.3  describes  tne  abbreviations  used. 


Mess  age  Type 
Message  Sender 
Message  Receiver 
Message  Tent 


(a  numeric  code)  . 

(a  numeric  code)  . 

(a  numeric  code)  . 

(an  aipnanumeric  field 
terminated  by  an  end 
cf  message  marker). 


Figure  4.6  The  General  Message  Format. 


4  G 


MZSSA  C  E-TYPE  NUMBER  A  NO  SAM 

E 

SR  J 

1 

L_ 

DE5I  |  PATH 

1 

TRAFFIC  UN IT 

HOST 

1 

RECP  1  HC 

REQUEST  RESULTS 

?P 

1 

HOST 

1  cr: 

NUMBER  OF  REQUESTS  IK 

i 

REQ? 

j 

| 

P?  i 

j  c 

A  TRANSACT  *ON 

1 

AGGREGATE  OPERATORS 

j 

REQ? 

j 

[ 

PP  1 

c 

c 

REQUESTS  WITH  ERRORS 

3 

R  E  C  P 

l 

3 

PP  ; 

3  C 

PArSED  TRAFFIC  UNIT 

i 

E  SQP 

DM 

CE 

NEW  DESCRIPTOR  ID 

II G 

DM 

CE 

EACKEND  NUMBER 

j 

IIG 

DM  | 

CE 

CLUSTER.  ID 

1 

DM 

IIG 

BC 

1  c 

REGUEST  FCF  K E  V 

1  0 

DM 

10 

IIG  1C 

EC 

DESCRIPTOR  II 

1 

EACKEND  FESULTS 

1 

R  EC  ? 

i 

F?  i 

EC 

EOF  A  F,  E  Q  U  E  S 1 

EACKEND  AGGREGATE 

1 

EEC  P 

i 

P?  | 

EC 

CREFATOR  RESUITS 

FECORD  THAT  HAS  CHANGED 

1 

R  EC  P 

i 

REQ?  ] 

EC 

CLUSTER 

i 

RESULTS  OF  A  RETRIEVE 

1 

RECP 

l 

EZQP  i 

EC 

CR  FETCH  CAUSIE  BY 

1 

I 

AN  UPDATE 

I 

j 

j 

L 

1  5 

EESCF.IPTOF.  IDS 

1 

DM 

1! 

3 

DMs  1 ! 

REQUEST  AND  DISK  ADDRESSES 

1 

DM 

1 

RECP  j 

£ 

CHANGED  CLUSTER  RESPONSE 

1 

1 

DM 

! 

! 

RECP  ! 

- 

FETCH 

1 

DM 

1 

REG?  | 

C 

CII  AND  NEW  VALUES  OF 

1 

RECP 

| 

i 

DM 

- 

ATT  PIE  UT  E  BEING  MODIFIED 

j 

i 

I 

J 

2  C 

T'iPZ-C  ATTRIBUTES  FOR  A 

2 1 

) 

DM 

21 

) 

CC  2 1 

)  £ 

TRAFFIC  UNIT 

j 

j 

| 

L ESC— ID  GROUPS  FOR  A 

i 

DM 

CC  i 

3 

TRAFFIC  UNIT 

i 

1 

CLCSTER  IDS  FOR  A 

DM 

I 

CC  | 

E 

1FAFFIC  UN  II 

1 

RELEASE  ATTRIBUTE 

DM 

J 

CC  I 

c 

RELEASE  ALL  AT T F I3UTES  FOR 

DM 

1 

CC  j 

£ 

AN  INS ER  I 

I 

J 

I 

*■>  j“ 

4-  -/ 

RELEASE  DESCRIPTOR— ID 

25 

DM 

2 ; 

3 

CC  2: 

:  E 

GROUPS 

ATTRIEUTE  IOCKFE 

CC 

DM 

DESCRIPTOR-ID  GROUPS  LOCKED 

CD 

1 

DM 

E 

CLCSIEE  IDS  LOCKED 

CC 

1 

DM 

1  3 

29 

NC  MORE  GENERATED  INSERTS 

RECP 

REQ  ? 

EC 

29 

NC  MCFZ  GENERATED  INSERTS 

REQ? 

DM 

1  CE 

29 

NC  MORE  GENERATED  INSERTS 

DM 

F.ECP 

EC 

30 

REQUEST  ID  OF  A  FINISHED 

30 

RECP 

30 

CC  3 

= 

/ 

FEQ  UEST 

I 

1 

1 

2  1 

AN  UPDATE  REQUEST  HAS 

R  EC  P 

1 

DM  | 

| 

FIN ISHED 

1 

j 

| 

3  1 

AN  UPDATE  REQUEST  HAS 

| 

1 

DM 

i 

1 

CC  1 

1  E 

FINISHED 

1 

1 

1 

Figure  4.7  MCES  Message  Types. 
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Figure  4.8  MDBS  Message  Abbreviations. 

C  c  ir  aun  i  c  at  ion  between  computers  in  MDBS  is  achieved  by 
using  a  time-di visicn-mult  if  lexed  bus  called  the  parallel 
com  a  u  r.  ic  at  icn  link  (PCI)  [Eef.  l4j.  MDBS  contains  a  soft¬ 
ware  interlace  to  this  bus  for  each  computer  consisting  of 
two  cc d p lament  a r y  processes.  The  first  process,  get-pcl, 
gets  messages  from  other  computers  off  the  PCL.  The  second 
process,  put-pci,  puts  messages  on  the  bus  to  be  ser.t  to 
ctiiet  computers.  The  controller  and  each  backend  have  their 
cwn  get-pcl  and  put-pel  processes. 

In  the  remainder  cf  this  section,  we  give  short  descrip¬ 
tions  of  the  definitions  of  MDBS  messages.  These  defini¬ 
tions  are  of  the  form: 

(message-type  number)  message— type  name:  explanation  of 

1^3  S  d 36 • 

The  descriptions  will  be  given  by  the  process  tnat  receives 
the  message.  These  descriptions  are  in  following  figures: 
Eeguest  Preparation  (Figure  4.S),  Post  Processing  (Figure 
4.1)),  Directory  Management  (Figure  4.11),  Record  Processing 
(Figure  4.12),  Concurrency  Control  (Figure  4.13),  Host 
processed  for  Test  Interface  (Figure  4.14), 

Information  Generation  (Figure  4.1b). 
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and  Inser 


(1)  nest  Traffic  Unit  :  The  traffic  unit  represents  a 
single  request  or  transaction  from  a  user  at  the 
hcst  machine. 

(13)  Eeccrd  that  has  Changed  Cluster  :  Tms  message  is 
a  record  which  has  changed  cluster,  Request 
Preparation  will  prepare  it  as  an  insertion  and 
sena  it  to  the  tacker.ds. 

( 2  S )  No  More  Generated  Inserts  :  Tms  message  indicates 
that  all  the  records  that  have  changed  cluster  as 
a  result  of  ar.  urda te  request  have  -Been  sent  to 
Eeguest  Preparation. 


(14)  Results  of  a  fetch  or  Retrieve  Caused  bv  an  Update: 
This  message  carries  the  information  froQ  a  fetcn 
or  retrieve  tack  to  Eeguest  Preparation  to  complete 
an  update  with  a  type-lll  or  a  type— IV  modifier. 


Figure  4.9  Eeguest  Preparation  Messages. 


(3)  Number  of  Recuests  in  a  Transaction  :  Reguest 
Preparation  sends  to  Pest  Processing  the  number 
cf  reguests  ir  a  traffic  unit.  This  enables  Post 
Processing  tc  determine  whether  the  processing  cf 
a  traffic  unit  is  complete. 

(4)  Aggregate  Operators  :  Eeguest  Preparation  sends 
the  aggregate  operators  to  Post  Processing. 

(3)  Eeguests  with  Errors  :  Requests  with  errors  will 
te  found  in  Eeguest  Preparation  by  the  Parser  ard 
sent  to  Post  Processing  directly.  Post  Processing 
will  send  requests  with  errors  back  to  the  hcst. 

(11)  Results  cf  a  Eeguest  from  a  Backend  :  This  message 
contains  the  results  that  a  specific  backend  feund 
fer  a  request. 

(12)  Aggregate  Operator  Results  from  a  Eackend  :  When 
an  aggregate  operation  needs  to  be  done  on  the 
retrieved  records,  each  backend  will  do  as  muen 
aggregation  as  possible  in  the  aggregate  operation 
function  of  Record  Processing.  This  message 
carries  those  results  to  Post  Processing. 


Figure  4. 1C  Post  Processing  Messages. 


(6)  Parsed  Traffic  Unit  :  Inis  is  the  prepared  traffic 
unit  sent  by  Fequest  Preparation. 

(29)  Nc  More  Generated  Inserts  :  This  message  indicates 
that  insert  request  for  ail  the  records  that  have 
changed  cluster  as  a  result  of  an  update  reguest 
have  beer  generated  and  sent  to  Directory 
Management. 

(7)  New  Descriptor  Id  :  This  message  is  a  response  to 
tie  Directory  Management  request  for  a  new 
descriptor  id. 

(6)  Eackend  Number  :  This  message  is  used  to  specify 
which  backend  is  to  insert  a  record. 

(15)  Descriptor  Ids  :  This  message  contains  the  results 
cf  descriptor  search  by  Directory  Management. 

(15)  Cld  and  New  'Values  of  attribute  being  Modified  : 
Feccrd  Processing  uses  this  message  to  check 
whether  a  reccrd  that  has  been  updated  has  char,  it  d 
cluster. 

(31)  An  Update  Request  has  Finished  :  Record  Processin 
signals  Directory  Management  tnat  an  update  reque 
has  finished  execution. 


Figure  4.11  Directory  Management  Messages. 
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(2C)  lyp 


Attributes  for  a  Traffic  Unit 


Ccncurrenc 


Control  takes  the  attributes  in  this  lessage  arc 
determines  when  Descriptor  Search  for  an  attribute 
can  be  performed. 

(2  1)  Descriptor— id  Groups  for  a  Traffic  Unit  : 

Concurrency  Control  takes  the  descriptor-id  croups 
in  this  message  and  determines  when  Cluster  search 
for  a  request  can  be  performed. 

(22)  Cluster  Ids  for  a  Traffic  Unit  :  Concurrency 

Ccntrol  takes  the  cluster  ids  in  this  message  and 
determines  wten  a  request  can  continue  with  Address 
Gen  era ti on  a  ’  ' 1 


the  rest  of  request  execution. 

j  v  =>~  — 


45 


(I)  r.fccuest  Results  :  Contains  the  results  for  a  recuestj 
after  ceing  cclltcted  from  ail  toe  backends  arc'  | 
acgregated,  if  necessary.  | 


Figure  4.14  Host  Messages. 


Cluster  Id  :  lirectcry  Management  sends  a  cluster  | 
id  to  Insert  Informatict  Generation  for  an  insert  | 
request.  IIG  will  decide  where  to  do  the  insert.  j 

Request  for  New  Descriptor  Id  :  Waen  Directorv  | 
Mat  agement  has  found  a  new  descri^  tor ,  it  is  sent  | 
to  Insert  Information  Generation  to  generate  an  id.| 


Figure  4.15  Insert  Information  Generation  Messages. 


E.  1 E  F  EXECUTION  OF  A  RETRIEVE  REQUEST 

In  this  section,  we  descrite  the  sequence  of  actions  for 
a  retrieve  request  as  it  moves  througn  MDBS.  The  sequence  of 
acticrs  will  be  descrired  in  terms  of  the  messages  passed 
between  the  MDBS  processes:  Request  Preparation  (REQF), 

Insert  Information  Generation  (IIG),  Post  Processing  (si). 
Directory  Management  (DM),  Record  Processing  (RECF)  and 
Concurrency  Control  (CC) .  For  completeness,  we  descrite  the 
actions  which  require  data  aggregation. 

First  the  retrieve  request  comes  to  REV?  from  the  host. 
In  the  present  i nplenentaticn,  it  comes  from  tne  controller. 


(5) 

(1C) 


EEQP  sends  two  messages  to  PP:  the  number  of  requests  in  the 
transaction  and  the  aggregate  operator  of  the  request.  The 
third  message  sent  by  REQP  is  the  parsed  traffic  unit  which 


goes  to  CM  in  t t  backenas.  EM  sends  tne  type— C  attributes 
needed  by  the  request  to  CC.  Since  type— C  attributes  ray 
create  new  type-C  s  ub-aesc  r  ir  tor  s,  tr.e  typ  e-C  attributes 
must  be  locked  by  CC.  Once  an  attribute  is  locked  and 
descriptor  search  can  be  performed,  CC  signals  DM.  CM  will 
then  perron  Descriptor  Starcn  on  m/n  predicates,  where  r. 
is  the  number  of  predicates  specified  in  the  guery,  and  n  is 
tne  number  cf  backends.  DM  then  signals  CC  to  release  the 
lock  cn  that  attribute.  DM  will  broadcast  the  descriptor  ids 
for  the  reguest  to  the  other  backends.  DM  now  sends  the 
descr  ip  ter- id  groups  for  the  retrieve  reguest  to  CC.  A 
desc r ip t crr id  group  is  a  collection  of  descriptor  ids  which 
define  a  set  of  clusters  needed  by  the  recuest. 
Desc r ip t cr-id  groups  are  locked  by  CC,  since  a  descrip tcr-id 
group  may  define  a  new  cluster.  Once  the  descriptcr-ii 
groups  are  locked  ana  Cluster  Search  can  be  performed,  CC 
signals  CM.  DM  will  then  perform  Cluster  Search  and  signal 
CC  tc  release  the  locks  on  the  descriptor-id  groups.  Next, 
DM  will  send  the  cluster  ids  for  the  retrieval  to  CC.  CC 
locks  cluster-ids,  since  a  new  address  may  be  specified  ror 
an  existing  cluster.  Once  the  cluster  ids  are  locked,  and 
tne  reguest  can  proceed  with  Address  Generation  and  tne  rest 
cf  the  request  execution,  CC  signals  DM.  DM  will  then 
perform  Address  Generation  and  send  the  retrreve  recuest  and 
tne  addresses  to  RECP.  Cnee  the  retrieval  has  executed  prep- 
erly,  RECP  will  tell  CC  that  tne  request  is  done  and  the 


Iocks  cn  the  cluster  ids  can  be  released. 


The  retrieval 


results  are  aggregated  by  each  backend  and  forwarded  tc  CP. 
E?  completes  the  aggregation  alter  it  na's  received  tne 
partial  results  from  every  backend.  When  P?  is  dcr.e,  the 
final  results  will  be  sent  to  the  user. 


IV.  AN  APPLICATION  OF  THE  METHODOLOGIES  TO  MDBS 


In  the  previous  chapters  we  aiscussed  the  separate 
topics  ci  atthodolcwies  for  joitj  internal  ana  externa.. 
pericraar.ce  measurements  of  database  systems  ar.i  toe 
£ult  i-t  acx  e  r.d  Datarase  System  (MDBS).  Tns  chapter  presents 
the  application  cf  these  me the  Jalopies  to  MDBS.  Tne  initial 
discussicn  concerns  irodii ication  to  toe  IDES  software.  ie 
iiscuss  the  decisions  made  during  j.mp  le  men  t  a  t  ion  ,  modifica¬ 
tion  cf  the  user  interface  process,  tne  ta  oxen i  processes 
and  the  controller  processes,  and  tat  issues  resolved  during 
imple mentat ion .  The  next  discussion  centers  on  tne  aodiri- 
caticrs  cf  the  MDBS  test  environment,  wrier  includes  test 
environment  cnanyes  and  software  tools.  The  final  discussion 
identifies  measurement  programs  that  were  lapieiei.tel 
outside  cf  the  MDBS  environment. 

A.  TEE  BCD IFIC ATION  CF  THE  MDES  SOFTWARE 

In  this  section,  we  begin  by  presenting  tne  decisions 
made  concerning  tne  implementation  of  internal  ar.d  external 
performance  measurements  on  MDBS.  Next,  we  discuss  the  codi¬ 
fications  cf  the  user  interface  and  tne  individual  MDBS 
processes.  Ke  conclude  this  section  by  relating  issues  which 
are  resolved  during  the  i nplementation  of  the  performance 
measurement  methodology. 

1  ■  lap  leuio  ntaticr  Deci  sic  ns 

When  designing  and  specifying  internal  and  external 
perfcraar.ee  measurement  me  tncdciogies,  decisions  must  fe¬ 
male  as  tc  the  most  advantageous  positions  to  place  tne 
checkpoints,  data  collections  ana  lata  aggregations.  These 


decisicr.s  are  rased  cr  the  reed  to  minimize  system  overhead, 
and  tc  provide  the  appropriate  level  of  detail  of  the  test 
data  obtained.  Primitives  ar.d  data  structures  mast  re  devel¬ 
oped  which  will  allow  the  measurement  programs  to  retain 
extensible  and  which  are  ccmpatiole  witn  existing  system 
software.  A  user  interface  must  be  developed  which  is  easy 
to  use,  should  not  require  the  user  to  possess  any  special 
knowledge  of  the  interface  in  order  to  use  it  and  shoull 
maintain  data  in  machine  readable  form  which  will  ailcw  for 
future  exparsion  of  the  performance  measurement  system. 

The  following  implementation  decisions  are  within 
tne  bounds  cf  two  constraints  placed  upon  us  by  the  current 
implementation  of  d  D  E  £  along  with  two  constraints  we  place! 
on  ourselves.  The  first  constraint  concerns  the  virtual 
memory  available  to  the  processes  resident  on  the  backends. 
The  operating  system  on  the  EDP-1 1/44  allocates  a  virtual 
memory  cf  6a  Kbytes.  Each  of  the  AD  S3  backend  processes  must 
fit  into  a  virtual  memory  of  this  size.  The  additional  sort- 
ware  added  as  a  result  of  performance  measurement  has  to  be 
constructed  so  that  it  will  fit  in  a  the  very  limited  memory 
space  remaining  in  eacn  backend  process.  The  second 
constraint  concerns  the  initial  dD  35  design  requirements 
which  calltd  for  a  broadcast  bus  between  minicomputers. 
Currently  a  Parallel  Communications  Link  (PCL)  is  being 
employed  as  the  i n t e r-com p u t e t  message-massing  mechanism. 
Messages  passe  3  over  the  PCI  are  sequentially  transmitted 
from  the  sender  to  t3e  receiver.  Tins  difference  in  opera¬ 
tion  between  the  PCI  and  the  rroadcast  bus  must  be  taxer, 
into  account  in  cur  attempt  to  validate  the  claims  cf  MILS. 
Additional  gerfcrmarce  measurement  programs  must  alsc  be 
writter  to  measure  message-passing  times  on  the  PCL. 

Ihe  third  constraint,  i.e.,  minimizing  overhead, 
significantly  influences  our  performance  measurement  design. 
This  subject  will  be  discussed  ir.  the  following  paragraphs. 


1**«=;  riDdl  constraint  ceais  with  cur  iesice  to  rac  MI  5 £  ur.  iz- 
y  d  e  l  !  tut  a  e  w  p  e  r  r  c  r  m  dtc  e  2c5iurt,acii  t  s  o  r  t « u  it .  a  a  t  a  ji  ^ 
art  r.ct  evaluating  the  system,  we  want  to  Cc  a~ie  to  run 
MDP5  wit:.  no  overhead  incu  rred  ny  the  additional  programs 
and  checkpoints  oi  the  per  fori  a  nee  a«asarc2e:.t  systei.  This 
is  accompnis  ne  d  by  bracketing  dal  perron  man  ce  m  e  a  s  u  rtiti.t 
software  within  special  preprocessor  mstracti  one  whicr. 
allow  us  tc  include  cr  out  tne  performance  aeas ur ement 
software  during  program  compilation.  A  definition  file  is 
created  containing  flags  which  are  used  to  determine  tne 
secticns  of  performance  measurement  code  to  be  compiled.  By 
compiling  separate  versions,  we  tnen  have  the  capanility  of 
running  MIES  without  performance  measurement  overhead  or 
with  the  overhead  introduced  when  we  select  certain  portions 
cf  the  performance  measurement  software  ror  compilation. 

Communication  in  MDBS  is  accce:r  iis  ned  by  passing 
messages.  Processes  which  are  resident  in  the  same  miniccm- 
puter  cc  mm  u  r.ic  at  e  by  using  inter- process  messages,  while 
processes  resident  in  different  minicomputers  commur.ica  te  by 
using  inter-computer  messages.  Actions  taken  by  tne  various 
processes  ir.  MDBS  are  initiated  by  the  receipt  of  a  message. 
Actions  end  when  that  message  has  men  processed  and  any 
resultant  messages  have  been  sent.  As  a  message  is  received 
by  a  process,  the  action  taken  by  the  process  is  dependent 
cn  the  message  origination  and  type.  The  general  MISS 
process  procedure  hierarchy  is  shown  in  Figure  5.1. 

The  highest  level  of  this  process  is  the  main  proce¬ 
dure  .  This  procedure  receives  tne  next  message  and  based 
upon  the  originator  cf  the  message,  calls  a  sub  procedure  ir. 
tne  procedure  hierareny.  The  message  works  its  way  down  this 
tree  or  sur  procedures  based  upon  tne  originator  of  tne 
message  arc  tne  message  tyye.  Jitimareiy,  tne  message 
arrives  at  a  m essa j e- ban  11 i n j  proceu  ire  (message  handier). 
The  message  handler  has  tne  responsibility  of  processing  the 
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Figure  5.1  The  MDBS  Procedure  Hierarchy. 

aessace.  Ir  ioir.g  sc,  it  may  call  other  procedures  lower  i; 
the  hierarchy.  MDBS’s  message  oriented  approach  naturally 
lends  itself  to  checkpoint  placement  at  this  level. 
Selection  or  aeasureaent  at  this  level  provides  the  iser 
witn  sufficient  processing  detaixs  wnile  not  ov  er  t  ur  d  en  ir. . 
the  user  with  excessive  information.  A  range  or  six  tc 
twelve  checkpoints  may  be  installed  in  each  MDBS  process  at 
this  level.  The  general  approach  to  tie  installation  oi 
checkpoints  is  shown  in  figure  5.2.  In  this  installation,  we 
insert  checkpoints  teth  bercre  and  after  every  message 
handler.  As  a  result,  we  obtain  tne  time  cr  entry  into  tne 
procedure  and  exit  from  tne  procedure.  The  dirferer.ee; 
hetween  these  times  is  the  time  it  takes  to  process  the 


.Me asur 1 r 3  at  tms  level  presents  ore  procltt.  Ihe 
syst.cn  c  i  •„ K  s  are  not  s  u  ft  icier,  t  ly  refined  tor  the 
processiry  speed  of  the  message-hand!  in  3  routines,  ice  clock 
on  the  tL?- 11/44  measures  time  in  discrete  time  intervals  or 
cr.iy  cr.e  sixtieth  of  a  second.  Ice  clock,  on  the  Vax-IV^di 
measures  time  in  discrete  time  intervals  or  only  one 
hundredth  ox  a  second.  In  any  yiver.  time  interval,  the 
system  t ; me  may  oe  accessed  Ly  the  performance  aea^  x  r t me nt 


so::'. 


means  that  access  max  occur  exact*/  1  r.  /  r. 


time  ii.  rexval  is  recorded  iy  t  r.e  o vs  tern 


:k  or 
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between  the  record  in^,  of  a  tire  interval  by  the  system, 
clock.  Because  of  this  conditio;.,  t:.e  variance  of  t r.e  uk 
measure  mert  wouli  ft  approximately  twice  the  siaiiest 
inter  vai .  This  variance  as  significant  when  it  is  compared 
to  the  time  it  taxes  a  messa  ge-nan  dim g  routine  to  -recess  a 
message.  A  ret  hod  rust  be  developed  to  reduce  this  variance. 
The  solution  is  to  send  multiple  requests  to  the  messa-e— 
hand  lire  routine  being  timed,  to  record  the  tire  for  each 
request  and  then  tc  compute  the  average  of  the  recent  i 
times,  thereby  obtaining  a  mere  accurate  measurement  cn  the 
true  processing  time. 

In  order  to  keep:  overhead  to  a  minimum  and  tc  xeeg 
tne  p  error  mane  e  measurement  system  exter.sirle  arm  simple, 
we  decide  tc  place  ainrmai  performance  measurement  software 
in  ar  .ICES  process.  Nc  processing  of  test  data  is  dcre  in  an 
a DBS  process.  All  test  data  is  sent  to  the  test-interr ace 
routines  for  aggregation  anu  storage.  Since  dCES  is  a 
iicssacfc-tased  system,  measurement  control  messages  and  test 
data  are  transferred  as  messages  utilizing  existing  ''ZEE 
communications  routines.  A  differently— oriented  system,  sue.-, 
as  procedure— oriented,  would  reguire  a  different  approacn  to 
measurement  software  communication. 

The  installation  of  the  checkpoints  regimes  that  i 
metncd  re  devise  j  tc  collect  the  information  obtained  by  the 
checkpoints.  The  information  could  re  stored  locally,  trans¬ 
ferred  tc  a  central  storage  location  in  the  minicomputer  or 
sent  to  the  test  interface  for  storage.  In  order  tc  reduce- 
the  system  overhead  introduced  iy  message  passing  we  deter¬ 
mine  thit  the  temporary  .ocal  storage  of  cat  a  would  re  most 
efficient.  As  pointed  out  previously,  one  of  tne  constraints 
placed  jtor.  tne  implementation  of  performance  measurement  is 
tne  virtual  memory  space  available  at  the-  Lac  ken  is.  itcrapj 
ci  the  test  data  generated  rrem  tne  checkpoints  would  have 
to  re  large  enough  tc  contain  sufficient  timing  information 


ar.d  small  enough  tc  reside  in  tie  constrained  virtual, 
memory  ujact  available  to  a  backend  process.  For  cur  timing 
meas  uremer.t  s,  the  upper  bound  c;i  tne  number  of  requests  Sent 
to  a  message  handler  at  one  time  is  fixed  at  one  hundred.  In 
ctaez  words,  we  assure  tna t  teasurin  3  a  given  function  mere 
than  10)  tires  ml  rot  provide  a  statistically  significant 
difrerence  over  measuring  ti.at  same  function  exactly  103 
times.  Giver,  this  upper  bound,  we  decide  that  a  static  array 
of  2GC  records  would  be  small  enough  to  fit  in  the  virtual 
memory  of  a  backend  process,  yet  large  enough  tc  held  a 
sufficient  amount  of  test  data.  Figure  5.3  shows  the  general 
approach  to  the  placement  of  the  performance  measurement 
routine  (Timer)  which  is  called  by  the  checkpoint,  accesses 
the  system  clock  and  manages  the  static  array. 

Another  Question  that  rust  be  answered  is  the  manner 
in  which  the  checkpoints  are  activated.  Should  we  activate 
cniy  cne  checkpoint  at  a  time  or  multiple  checkpoints  at 
cr.ce  r  We  determine  that  activating  more  than  one  checkpoint 
at  a  time  could  introduce  error  into  tne  measurement.  If  cne 
routire  (A)  which  is  being  measured  called  another  rcutine 
(E)  which  is  also  being  measured,  tne  time  necessary  to  do 
timing  measurements  cr.  (3)  would  increase  the  total,  running 
time  of  (A) .  Because  of  this  we  only  aixow  tne  measurement 
of  cne  routine  at  a  time. 

The  desire  tc  provide  a  user  interface  which  is  easy 
to  use  and  requires  rc  particular  knowledge  of  test  inter¬ 
face  implementation  leads  us  to  develop  a  menu— driver 
system.  The  modularity  of  the  performance  measurement  design 
lend  s  itself  to  easy  acce  ss  via  menus.  The  menu- driver 
system  is  also  compatible  with  the  existing  test  interface 
sys  t  e  tr . 

The  final  problem  is  hew  to  process  and  stcre  the 
raw  test  data  received  f rem  tne  varioas  trocesses.  We 
teguire  that  the  user  have  access  to  both  raw  data  and 
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Figure  5.u  The  Procedure  Hierarchy 
with  Checkpoints  and  Timer. 


summarized  mioriaticr.  Also,  we  require  that  the  data  he 
available  for  further  machine  processing.  These  problems  art 
eliminated  ty  maintaining  all  collected  data  in  files.  rt’her. 
raw  data  is  received  from  a  process  it  is  immediately  stored 
in  a  file.  Once  ali  requests  which  are  to  be  timed  have 
finished,  the  file  containing  the  raw  data  is  accessed  and 
processed  to  produce  another  file  containing  summarized 
information  on  the  various  message-handling  routines  whicn 
have  been  measured.  A  history  of  this  information  is 


compiled  as  the  under  lying  operating  system  (  on  the  miri- 
conriter  where  t  ne  controller  software  resides  )  creates  new 
ver s i cr. s  of  toes e  files  each  time  the  measurement  programs 
are  invoked. 

2.  Ike  Modi  fica  tion  s  or  the  User  In  tt  r  f  ace 

trier  to  the  implementation  of  tne  performance  meas¬ 
urement  methodologies,  tne  test  loterrace  process  consisted 
ci  these  programs  necessary  tc  generate  a  test  database, 
load  a  test  database  and  execute  reguests  against  the  test 
database.  Toe  i  asp  le  aenta  tic  n  ci  performance  measurement 
software  within  tne  existing  software  structure  of  tne  test 
interface  is  accomplisned  by  expanding  tne  existing  hier¬ 
archy  cf  control  and  by  integrating  performance  measurement 
software  with  existing  test  interface  software.  Figure  5.4 
shows  the  test  interface  procedural  hierarchy  with  the 
performance  measurement  modifications. 

The  user  selects  actions  to  be  performed  by  trav¬ 
ersing  a  tree.  At  each  node,  a  decision  is  made  as  tc  the 
path  tc  fellow.  9y  following  certain  paths,  the  user  nas  the 
capability  to  generate  a  database,  load  a  database  or 
execute  the  test  interface.  When  the  user  decides  to  execute 
the  test  interface,  a  decision  is  then  made  as  to  what  path 
to  follow  cn  the  test  interface  sub-tree.  The  user  may 
cnoose  a  new  database  to  work  with,  create  a  new  list  of 
traffic  units,  modify  an  existing  list  of  traffic  units, 
select  traffic  units  from  an  existing  list  for  execution, 
select  an  existing  list  so  that  ail  traffic  units  cn  the 
list  may  be  executed,  display  the  results  of  external  meas¬ 
urement  cr  perform  a  combination  of  internal  and  external 
per  f  crmacce  measurement.  The  user  may  traverse  the  tree  at 
will  moving  up  and  dewn  the  tranches  to  accomplish  a  wide 
variety  cf  tasks. 
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The  MDBS  software  in  the  test  interface  contains  a 
procedure,  called  by  the  other  MDBS  procedures,  to  execute  a 
request  (transaction).  That  as,  tc  forward  a  request  (trans¬ 
action)  to  F.eguest  Preparation  for  processing.  This  proce¬ 
dure  is  selected  fcr  tne  placement  of  the  external  and 
internal  performance  measuring  software  necessary  tc  time 
and  manipulate  reguests.  External  measurements  are  taken 
from  this  procedure  immediately  before  the  reguest  is  sent 
to  Becuest  Preparation  for  processing  and  after  the  results 
are  returned  from  Post  Processing.  Software  is  added  tc  this 
procedure  tc  generate  requests  to  the  MD3S  processes  which 
initialize  the  message-handling  routines  for  internal 
performance  measurement,  generate  multiple,  identical 
requests  in  order  tc  reduce  the  timing  variance  (as  previ¬ 
ously  discussed)  and  to  generate  the  test  data  collection 
message.  The  number  of  multiple  reguests  to  generate  is 
provided  tc  this  routine  by  a  variable  defined  at  compile 
time.  Ibis  procedure  receives  the  information  necessary  to 
accomplish  these  other  tasks  by  sharing  a  first— in— ias  t-cut  •, 

stack  and  a  pointer  to  the  top  of  the  stack  with  the 
performance  measurement  software.  The  evaluator  interacts 
with  the  performance  measurement  software  to  nuiid  a  stack 
of  internal  performance  measurement  requests.  This  procedure  • 

then  draws  from  that  stack,  initializes  the  message- 
handling  routine  selected  by  the  evaluatcr,  generates 

* 

multiple  copies  of  the  MDBS  reguest  selected  by  the  evalu¬ 
ator,  and  generates  the  reguest  necessary  to  collect  tne 
test  data  from  the  process  which  contains  the  message- 
handler  being  evaluated.  Figure  5.5  snows  the  relationship  1 

bttween  this  procedure  and  the  performance  measurement  sort-  -j 

ware  and  its  data  structures.  • 

In  addition  to  external  system  timing,  ether 
performance  measurement  functions  provided  by  the  new  soft¬ 
ware  include  routines  which  allow  the  user  to  1)  select 
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introduced  into  existing  test  interface  software  to  aid  in. 
tie  message-passin g  require  xer.ts  of  the  pe rf  ormar.ee  measure- 
1  e  n  t  s  \  s  t  e  m  . 

2  .  Z  he  Modi  fica  t  ion  of  Indi  vidua  1  Processes 

The  PCL  processes  within  MDE5  are  modified  to  rass 
ferfcnance  measurement  messages.  A..1  of  the  remaining  ML 3 3 
processes  received  identical  modification.  The  sena/receive 
portion  cf  ever  y  process  is  modified  to  include  t  r.e  Cara- 
tility  cf  processirg  performance  measurement  messages. 
Send/receive  is  used  for  inter-process  message  passing. 
Checkpoints  are  placed  in  the  3D3S  processes  at  the  message— 
handling  (lew)  routine  level.  A  timer  routine  is  placed  in 
each  process  which  receives  control  messages  from  the  test 
interface.  An  ini tializa tion  message  causes  the  txmer 
routine  to  initialize  the  data  collection  array  to  zero  and 
turn  on  a  selected  checkpoint.  As  MDBS-generated  messages 
pass  through  a  checkpoint,  the  timer  routine  is  called.  The 
timer  routine  accesses  the  system  clock  and  stores  the 
message  type  and  time  in  an  array.  A  completion  message  from 
the  test  interface  causes  the  routine  to  transmit  the  data 
collected  in  tne  array  to  the  test  interface  and  tc  turn  eff 
the  checkpoint  which  is  timed.  Figure  5.6  snows  the  modifi¬ 
cations  made  to  the  directory  management  process  as  an 
example  cf  the  implementation  of  the  general  modifications, 
shown  in  figure  5.3. 

4 .  Iss  ues  hesolved  During  the  Implementation 

f* C 2 i  is  an  experimental  database  machine.  As  such, 
it  is  under  constant  modification  and  subject  to  use  by  many 
system  developers.  The  MDBS  software  engineering  environment 
requires  that  versions  be  used  to  control  program  modifica¬ 
tion,  but  it  is  impractical  tc  create  new  versions  cf  MC3S 
every  time  a  single  program  is  modified.  One  solution  we 
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an  entry  in  the  in— use  file  which  indicates  who  is  currently 
modifying  that  -articular  program.  Inis  net  nod  allowed  the 
developer  tc  modny  a  fro^raa,  compile  and  test  the  xciiri— 
caticr.  ir.  an  environment  away  iron  the  main  MISS  environment 
(  in  order  tc  avoid  cc  11  promising  a  runcticnai  system  )  and 
return  the  program  tc  MDE5  upon  completion,  or  testing.  Tr.is 
method  avoids  the  possibility  of  two  developers  concurrently 
modifying  the  same  program  and  the  ensuing  proolems.  Machine 
time  is  also  at  a  premium.  There  is  no  easy  solution  ror 
this.  Much  of  the  measurement  must  be  conducted  during  non— 
peak  hours  such  as  late  evening  and  weekends.  This  is  neces¬ 
sitated  by  the  requirement  that  tne  measurement  cf  .IDES  be 
accomplished  in  a  stand-alone  environment.  Since  the  MDBS 
controller  is  implemented  on  a  time— sharing  system,  the 
entire  machine  has  tc  he  reserved  for  performance  testing  so 
that  MESS  could  be  rur  in  isolation. 

The  performance  measurement  system  places  additional 
demands  on  MDBS  system  message-passing  software.  Except  for 
one  case,  the  system  responds  without  protest  to  this  unex¬ 
pected  lead.  Tne  message-p recessing  routines  of  the  MESS 
hackends  are  not  designed  to  handle  the  transfer  of  200 
internal  performance-measurement  messages  from  a  backend 
process  to  the  controller.  There  is  not  sufficient  space 
available  tc  store  the  pointers  required  to  access  this  many 
messages.  The  MDBS  programs  are  easily  extended  tc  account 
for  this  change  in  message  traffic. 

The  MDBS  controller  resides  on  a  VAX-1  1/7E3  wnic.n 
operates  under  a  time-sharing  mode.  When  inter-computer 
messages  are  passed  or  the  PCL,  tne  operating  system  expects 
a  confirmation  within  a  certain  time  interval.  hah  u> 
problems  occur  during  tne  noraa.  operation  of  M.Dff,  r  re- 
large  message  traffic  from  tne  -acxenas  to  tr.tr.  control  !•  r 
during  internal  measurement  require  more  1 1  a.  *.  t  .1  r  t ;.  a  ♦ 
ailoted  tc  the  ccrtrcxiec  Jurxag  its  .  j.t »  11 


VAX- 1  1/7c3.  The  result  is  tnat  tr.e  controller  processes  o;. 
t;.e  VAX  are  suspended  while  the  cackend  is  stilx  trai.sm  itir.  g 
over  t h e  PCI.  when  tie  ?CL  receives  no  response  it  signals 
an  error  and  alorts.  Ofc vio  usl y ,  this  is  not  a  pr  cl  ie  z  wren 
the  "EES  system  runs  stand  alone.  However,  such  aiertior. 
does  previdt  more  than  an  inconvenience  during  tne  lmpiener.- 
taticn  of  the  performance  measurement  system. 

Currently,  MEE5  utilizes  two  different  ty.e  cf  mini¬ 
computers.  This  translates  into  two  different  operating 
systems,  two  different  text  editors,  two  different  compilers 
and  two  different  system  clocks  wmen  recorl  times  in 
differing  units.  Because  of  this,  performance  programs  in 
the  controller  processes  and  tne  bacicend  processes  are  ret 
identical.  Different  access  mechanisms  for  system  timers 
must  te  developed  and  a  routine  must  be  aeveioped  tc  convert 
tne  times  received  from  the  tacxends  into  the  eguivaient 
time  units  cf  the  controller.  Additional  time  and  effort 
are  required  to  become  sufficiently  knowledgeable  or.  tne  two 
systems  in  order  to  begin  implementation  of  tne  performance 
measurement  methodology. 

E.  TEE  MODIFICATION  CF  THE  MDES  TEST  ENVIRONMENT 

In  conducting  performance  measurements,  one  demands  tr.at 
all  the  measurements  he  consistent  as  well  as  reproducible. 
There  should  be  no  inconsistent,  unexplainable  results. 
Further,  the  results  should  te  reproducible  with  re-runs. 
This  section  discusses  the  necessary  changes  in  the  test 
environment  to  insure  consistency  and  r  eproducib  lit  y .  Tier, 
we  present  the  software  tools  used  to  make  tne  testing 
easier  and  smoother. 


1.  Charges  to  the  last  Envi consent 

"he  me  t  a  cdol  c  Sies  tor  mttrtdi  and  external  perrorm- 
ar.cc  measurements  or.  a  database  system  have  ore  preregui- 
site.  The  results  mtst  rot  re  acciuentai.  Iaese  results 
reed  t o  he  consistent  ar.u  r eprcuucible.  lo  achieve  ccr.eis- 
tercy  arc  repro ducitlity,  we  must  he  able  to  control  tne 
test  e  r  v  ir  o  r.me  n  t .  Every  scientific  experiment  requires  the 
test  environment  to  he  controlled,  to  insure  that  all 
factors  ertfcCting  the  experiment  are  known. 

The  experimental  MDBS,  the  system  to  he  tested,  has 
its  controller  processes  running  under  a  VAX/VMS  environ¬ 
ment.  Thus  requires  these  processes  of  the  controller  to  be 
run  simultaneously  with  the  other  non-system  processes  in  a 
timesharing  environment.  Under  tais  environment,  the 
results  obtained  would  he  erratic  and  inconsistent.  To 
alleviate  this,  several  preliminary  steps  are  taxen  prior  to 
frnai  testing.  Tne  tests  are  run  stand-alone  with  ail  ether 
logins  to  the  computer  disarled.  Ai^  processes  are  giver, 
tne  highest  possible  real-time  priority.  Swapping  cut  of 
processes  in  the  wait  state  is  disabled  to  retain  the 
processes  ir.  the  physical  memory.  Pa^e  faults  are  disabled 
by  increasing  tne  working  set  size  to  the  size  of  the  image 
cf  each  process.  In  this  way,  the  VAX/VMS  system  appears  to 
tne  evaluator  as  a  single  user  system. 

2  •  -of  hf^re  Tools  for  the  Test  Environment 

An  evaluator  should  understand  the  system  tc  be 
tested,  determine  the  various  parameters  to  ce  altered, 
specify  the  various  data  to  re  co.i.iected,  and  interpret  the 
results.  Tedious  and  busy  work,  such  as  modifying  the  input 
set  cr  the  system  configuration,  can  re  done  manually  and 
are  time-ccnsum  ing  without  proper  tools.  Ne  v  e  r  t  ae  less, 
these  modif icaticns  are  necessary,  and  can  he  automated  by 
using  software  tools. 
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In  [Ref.  9]/  software  has  been  provided  tc  generate 
a  database  ar.d  a  request  set,  lead  tne  database,  and  run  the 
request  set  against  the  database  whicn  can  all  re  used  in  I 

the  test  in  q  of  MDBS.  This  allows  for  easy  creation  ar.  i 
codification  of  the  selected  database  and  re-nests.  The 
system  software  needs  to  me  codified  durinq  tne  testing  to 
accommodate  such  things  as  changes  in  tre  number  of  backends  ft 

being  used  by  the  systea  and  whether  or  not  internal  testing 
is  tc  be  performed.  Bach  change  requires  a  r  ecoar  iiaticr.  of 
tne  system  software.  To  facilitate  ta  is  change  and  to 
insure  only  recompilation  oi  necessary  files,  the  Unix  ( 

•maxe*  command  is  used.  3riefiy,  execution  of  this  command 
would  check  a  file  created  by  the  autnor.  This  file  would 
indicate  all  i nterde pendenc ies  of  all  files  of  MDBS.  Ir  a 
file  has  teen  changed,  all  ether  files  effected  by  mis  ft 

change  will  automatically  be  recompiled  ar.d  relinked  upon 
executicr  cf  the  'make'  command.  In  tnis  manner,  the  system 
could  be  reconfigured  wit.,  ease  and  with  tne  assurance  tr.at 
ail  effected  files  are  changed.  Using  these  software  tccis  |> 

for  the  test  environment  and  with  proper  control  of  tne  test 
environment,  tne  tests  are  made  easier  to  conduct  and 
control  and  are  knowr  to  be  consistent  and  reproducible. 

C.  AEDITIC  NAL  MEASUREMENT  SOFTWARE  REQUIREMENTS 

In  erdtr  tc  completely  evaluate  MDBS,  the  message 
passing  mechanisms  must  be  monitored  to  determine  tne  time 
required  tc  pass  betn  i rter— compu ter  and  inter-prccess 
messages.  Although  the  measurement  of  these  messages  could 
cccur  during  the  execution  of  dDBS,  the  environment  under 
which  the  messages  are  passed  could  be  more  easily 
controlled  if  the  messages  are  evaluated  outside  cf  the  d  133 
environment.  The  results  of  these  measurements  are  contained 
in  the  next  Chapter  VI. 
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1 .  Inter-commuter  Mess  age  Pucesou  ;  Measurement 

^ew  software  does  not  have  to  ft  envelope?  to 
measure  the  time  required  to  pass  messages  or.  the  PCI. 
Programs  are  provided  ty  the  manufacturer  of  tne  PCI  wrier 
measure  the  message-passing  time.  The  evaluator  is  giver  the 
capability  cf  specifying  which  cede  on  tne  PCL  is  to  receive 
the  message,  the  message  length  and  tne  number  of  messages 
to  send.  Ihe  software  generates  and  sends  the  messages,  then 
provides  the  total  time  to  transmit  the  messages  to  tne 
evaluator.  Ihe  PCI  is  implemented  as  a  ring  bus.  Because  of 
this  style  of  implementation,  we  decide  to  send  messages 
from  one  selected  node  to  itself.  The  times  obtained  are  an 
upper  bound  to  the  i r ter-co m pu te r  message  passing  time. 

2 •  Inter-prccess  Message  Process inq  Measurement 

Ercgrams  are  written  for  the  inter-process  message 
processing  measurement.  To  determine  the  time  required  to 
pass  a  message,  we  developed  two  programs.  The  first  program 
gets  the  time,  generates  a  selected  number  of  messages  with 
a  selected  message  length,  and  sends  them  to  a  second 
program  which  receives  the  messages  and  tr.en  gets  the  time. 
We  run  the  first  program  at  a  higaer  system  priority  than 
the  second  to  prevent  tne  system  from  process  switching 
before  all  the  messages  nave  teen  generated.  After  genera¬ 
tion  cf  ail  messages  ty  the  first  program,  we  tnen  set  the 
system  priority  of  the  sending  program  heiow  that  cf  the 
receiving  program,  thereby  forcing  a  process  switch.  We  can 
then  compute  the  average  time  it  taxes  to  pass  a  single 
message  cn  the  machine.  To  obtain  a  nigner  degree  cf  accu¬ 
racy  we  must  account  for  the  time  it  takes  the  system  to 
switch  processes  and  the  time  it  takes  tne  system  to  alter 
the  priority  of  the  sending  process.  Programs  are  written  to 
account  for  these  times.  The  program  written  to  account  for 
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V.  TEE  BENCHMARK  OF  MDBS 

lie  construction  cf  the  test  database  and  tee  selection 
cf  requests  are  very  important  in  tne  performance  measure— 
lent  cf  a  test  database  system.  The  test  database  should  be 
representative  of  a  real  database,  but,  as  presented  in 
[Ref.  7],  the  test  database  sneuid  be  modeled  independent  of 
any  specific  database.  Both  the  Test  database  an!  the 
requests  selected  shculd  be  property  modeled  to  ailew  for  a 
complete  exercise  cf  the  target  system.  At  the  same  t^me, 
parameters  must  not  be  selected  randomly,  but  rather  should 
be  created  to  provide  the  evaluator  flexibility  and  ease  of 
evaluation.  In  this  chapter,  we  first  describe  tne  manner 
in  whicn  tne  test  database  is  modeled.  We  then  describe  tne 
request  set  which  is  used  in  the  ^erxormance  measurement 
experiments. 

A.  TEE  SELECTED  DATAEASE 

Since  MDBS  is  ac  experimental  database  system,  it  is 
constantly  being  improved  and  enhanced.  For  this  reasen, 
tne  test  database  is  designed  to  facilitate  measurements  by 
being  easily  expandable.  A  distinction  will  be  made  ir.  the 
following  discussions  between  the  design  of  the  test  data¬ 
base,  which  allows  for  future  measurements,  and  the  actual 
implenertation  of  the  test  database  used  in  the  measurement 
experiments . 

1  .  The  Design  of  the  Model  Pat anase 

Several  factors  must  be  considered  in  the  design  of 
a  model  database.  Since  the  system  being  measured  can  be 
configured  with  either  one  or  two  bacxends. 


the 


•  work’ 


required  tc  process  a  request  has  to  ue  evenly  divisible  to 
accom ic d a te  the  use  cf  either  one  or  two  backends.  Ine 
types  ci  work  involved  are  attribute  search,  descriptor 
search,  cluster  search,  address  generation  and  the  retrieval 
and  selection  of  records. 

Table  I  displays  the  three  configurations  to  be  used 
in  the  terf crmance  measurement  of  MDBS.  The  conri gur a ti cns 
have  been  selected  tc  simplify  the  verification  of  the  MDBS 
performance  and  capacity  claims.  These  claims  are  to  1) 
halve  tr.e  response  time  by  doubling  the  numrer  of  tackcrds 
and  keeping  the  size  cf  tne  database  constant  and  2)  main¬ 
tain  a  constant  response  time  by  doubling  both  the  number  of 
backends  and  the  size  cf  the  database.  As  shown  in  Table  I, 
going  from  Test  A  tc  Test  E  maintains  a  constant  database 
size  but  allows  the  database  to  be  evenly  split  between  two 
backends.  Conversely,  going  from  Test  3  to  Test  C  doubles 
the  size  of  the  database  at  each  backend. 


TABLE  I 

The  Benchmark  Configuration 


TestJ 

Nc.  of  Backends 

Mby te/fcackena 

A 

1 

n 

L 

3 

2 

1.  5  n 

C 

2 

n 

Tc  properly  evaiuate  a  dataoase  system,  varicus 
record  ~izer  need  to  te  used.  The  sizes  are  chosen  based  on 
t  •  i  z  *  ~  i  tne  unit  cf  disk  management.  In  MDBS,  this  is 
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the  logical  track,  cr  clock.  M  DBS  processes  information 
from  tne  secondary  memory  using  a  4Kbyte  logical  track. 
Given  a  fiocksize  of  4Kbytes,  we  recommenc  ccr.structirg  the 
database  with  record  sizes  of  200  bytes,  400  bytes,  1C00 
lytes,  and  2000  bytes  £Bef.  7].  This  gives  a  range  of  2  to 
20  records  per  block.  This  also  creates  an  environment 
where  four  separate  databases,  corresponding  to  the  fcur 
record  sizes,  must  be  generated  and  tasted  for  each  configu¬ 
ration  given  in  Table  I.  Table  II  gives  the  corr espcndin g 
relationship  between  records  and  blocks. 
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TABLE  II 

The  Reccrd-and-Slcck  Belationship 


Eecord 

i 

Records 

- ~\ 

1 

Size 

in  Bytes 

1 

1 

per 

E-lOCK 

1 

| 

200 

1 

_J _ 

20 

1 

1 

400 

4— 

10 

1000 

1 

-+— 

1 

4 

2000 

k. 

As  described  in  Chapter  III,  the  target  system 
stores  records  in  clusters.  Five  cluster  categories  have 
been  selected  for  use  in  the  creation  of  the  model  database. 
The  distinguishing  characteristic  of  a  ciuster  category  is 
the  number  of  blocks  used  to  store  the  records  in  the 
particular  category.  Table  III  outlines  tne  sizes  of  c  aci. 
of  the  five  cluster  categories.  Cr;e  final  note,  the  number 
of  blocks  per  cluster  must  be  even.  T.vus,  when  the  number 
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cf  backends  is  increased  from  cne  to  two  with  tae  number  of 
records  in  the  database  remaining  constant,  we  are  guaran¬ 
teed  that  each  backer.d  will  have  the  same  number  of  blocks 
per  cluster.  For  example,  when  the  cluster  category  is 
small,  each  backend  would  have  one  blocic  for  the  particular 
cluster,  insuring  an  even  distribution  of  the  database. 

1  ~  1 


TAELE  III 

The  Cluster  Arrangement 


jclusters  Cate  gory |  Biocks/C  luster  j 

small 

1 

2 

sma 11-medi um 

4 

medium 

6 

medium-lar  ge 

8 

large 

10 

_ J 

Combining  the  data  in  Tables  II  and  III,  we  can 
construct  a  matrix  cf  data  which  represents  the  number  of 
records  per  cluster  category.  Table  IV,  indexed  using  the 
cluster  category  and  the  record  size,  details  this  informa¬ 
tion.  The  number  ci  records  per  cluster  is  obtained  by 
multiplying  the  Reco ids/3io ck  results  from  Table  II  by  the 
corresponding  Blocks/Cluster  results  from  Table  III. 

The  remaining  considerations  when  developing  a  test 
database  involve  the  specification  of  the  directory  s  tr  uc- 
ture  for  the  particular  record  type.  In  MDBS,  a  record 
template,  which  describes  the  record  structure  is  defined. 
The  record  template  defines  the  number  of  attributes  in  the 


TABLE  IV 

The  Records  per  Cluster  Category 


Number  of  Records  per  Cluster  j 

(Record 

I  Siz  e 

|  in  Bytes) 

(200  ) 

_ 

(400) 

(  1000) 

(  2  C  0  0  ) 

C  C 

I  A 

U  T 

S  E 

T  G 

E  C 

R  R 
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small 

40 

20 

3 

4 

. 

s  m  all-mediu  a 

80 

40 

.  ..  . 

1  0 

3 

medium 

120 

60 

24 

- 

12 

me  cium— large 

160 

30 

32 

16 

L 

large 

200 

100 

40 

20 

record,  and  for  each  attribute,  the  attribute  naire  arc  the 
attribute  type  (either  integer  or  string) .  Given  a  record 
template,  the  directory  and  non-directory  attributes  are 
specified.  For  each  directory  attribute,  a  descriptor  t^pe 
and  descriptor  ranges  are  defined  (see  Chapter  III). 

2 .  The  Implementation  of  th e  Model  Database 

This  section  examines  the  implementation  decisions 
made  when  specifying  the  test  database  and  the  testing 
strategy.  The  current  version  of  MDBS,  the  primary-memory- 
based  directory  management ,  stores  the  directory  tables, 
i.e.,  the  AT,  DL  IT ,  and  CDT ,  in  primary  memory.  Giver,  the 
primary  memory  limitations  of  the  backend,  we  are  forced  to 
limit  the  variables  mentioned  in  the  previous  section.  Cur 
first  decision  is  to  limit  the  size  of  the  test  database  to 
a  maxima  of  1000  records  per  backend.  Table  V  displays  the 
configurations  which  are  used  in  the  performance  measure¬ 
ments  of  MIES. 
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TABLE  V 

The  Measurement  Configurations 


TEST  | 

No.  of  Backends 

3 ec or ds/ Backend 

A  .  E  | 

1  I 

1000 

E  .  E 

2 

500 

C.E  | 

2 

100C 

A .  I  i 

1 

1000 

B.I  ] 

2 

500 

_ i 

liv e  different  system  conf igurations  are  needed  for 
the  MDBS  performance  measurements.  Tests  A.  E,  B.E,  anc  C.E 
are  conducted  without  internal  performance  software  an 
place.  Test  A.  E  configures  MDBS  with  one  hacicend  arc  one 
thousand  records  in  the  test  database.  Test  B. E  configures 
MDBS  with  tko  backends  and  one  thousand  records  split  evenly 
between  the  backends.  Test  C.  E  also  configures  MDBS  with 
two  backends,  but,  the  size  of  the  database  is  doubled  to 
two  thousand  records.  Test  A. I  and  3.1  are  conducted  with 
internal  performance  software  in  place.  Test  A. I  configures 
MDBS  kith  one  backend  and  one  thousand  records  in  the  test 
database.  Test  B.I  configures  MDBS  with  two  backends  and 
one  thousand  records  split  evenly  Detween  the  backends. 
'Using  these  five  configurations,  the  verification  of  the 
MDBS  performance  and  capacity  claims  is  simplified  anc  tne 
performance  measurement  methodology  or  computir.j  the 
interrai  measurement  cveraead  is  racilitated. 

Cur  second  decision  fixes  tne  recora  size  at  2  0  D 
hytes.  The  200  byte  record  minimizes  the  primary  memory 
regained  to  stone  the  record  template.  In  actuality,  a 
record  of  198  bytes  is  used.  The  record  consists  cf  33 
attributes,  each  reguiring  t  bytes  of  storage.  The  record 


template  rile  used  in  our  experiments  is  shown  ir.  Figure 
6.1.  Ct  the  33  attributes  listed,  INT21  and  INIZ2  are 
directory  attributes.  :i'JLU  and  STROO  to  STP.23  are  nen- 
directcry  attributes. 

In  cur  next  decision,  the  descriptor  types  anc  the 
descriptor  ranges  for  the  two  directory  attributes,  IN1F1 
and  1NTE2,  are  defined  in  the  descriptor  files  (see  Figure 
6.2)  .  Ihe  values  for  INTZ1  are  classified  by  using  five 
type-A  descriptors,  each  of  wnich  represents  a  range  cf  230. 
Ihe  values  for  IK1E2  are  also  classified  using  type— A 
descriptors.  The  first  twenty— three  ranges  for  INTF2  cover 
40  values,  with  the  last  range  covering  80  values.  Ihe 
non- uni f or mity  of  the  IMTZ2  descriptor  ranges  is  caused  by  a 
size  constraint  m  the  Concurrency  Control  process. 


Attribute  Marne 

Attribute  Type 

INTF1 

integer 

INTE2 

integer 

M  01 31 

string 

SIFC0 

string 

STE01 

m 

string 

• 

• 

STF29 

• 

string 

Figure  6.1  The  Hecord  Template  File. 


Ey  utilizing  the  attribute  and  descriptor  files,  the 
record  file  is  generated.  INTE1  and  INTE2  are  identical, 
being  the  next  sequential  number  after  the  previous  record, 
starting  at  1.  Tnerefore,  the  one  thousandth  record  would 
have  the  (INTE1,  IMZ2)  pair  set  to  1000.  Ihe  .'*0111 
attribute,  which  is  cf  type  character  string,  is  set  tc  Coe 
for  a  database  of  orly  1000  records.  The  intent  of  this 
attribute 


is  tc  increase  the  rumher  cr  records  fti  cluster  in  tne 
catarase .  This  is  dene  by  setting  ULTI  to  Two,  Three, 
etc.,  for  each  (IME1,  INTE2)  rair  in  t ;ic  database. 
Therefore,  tc  double  the  size  cr  the  database,  every  (INTZ1, 
I ;i T E 2 )  pair  will  have  an  associated  hULTI  attribute  witr. 
values  of  Cne  and  Twc.  The  remaining  attributes,  ST5C0  to 
STR  2  9,  are  set  tc  Xxxxx  as  fillers.  Figure  6.3  depicts  the 
general  layout  cr  tte  record  file  for  1003  records  where 
COL II  is  set  to  Cne. 

Given  the  structure  aescnced,  our  last  decision  is 
irade  for  us.  The  specif i cat  ion  of  24  descriptors  fer  the 
Ib'TSw  attribute,  coupled  with  tne  record  file  structure, 
generates  a  database  that  contains  24  clusters.  The  first 
23  clusters  correspond  to  the  small  Cx  uster  category,  and 
each  contains  40  records.  The  last  cluster  corresponds  to 
the  s  nall-iredi  u  m  cluster  category  and  contains  80  records. 
To  aairtair  consistercy  in  the  retrieval  requests  (discussed 
in  the  next  section),  we  avoid  any  requests  that  access  the 
last  80  records  ir.  tne  test  database  using  the  INTE2 
attribute. 

E.  TEE  FEQ0ES1  SET 

The  recues  t  set  used  for  cur  performance  measure  ire  rt  is 
given  in  Figure  6.4.  Tne  retrievals  are  a  mix  of  single  or 
double  predicate  reguests.  Since  the  majority  or  the  work 
done  on  a  database  is  to  retrieve  data,  we  limit  the  meas¬ 
urements  to  only  retrieve  requests.  In  every  request,  1/2 
of  the  target  attribute  values  for  each  record  is  returned. 
The  first  retrieve  is  a  request  for  only  two  records  from 
two  separate  clusters.  The  second  reguest  retrieves  1/4  of 
the  database.  Seven  of  the  24  clusters  must  be  examined. 
Ail  records  in  each  cf  the  first  six  clusters  are  retrieved. 
Cnxy  1/4  cf  the  seventh  cluster,  defined  by  the  range  from 
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Be-uest  Number 

Retrieval  request  | 

1 

{IN  151  =  1 0)  or  (IN  151=230)  | 

2 

(INTE2  <  250)  ] 

3 

( I NT 52  <  500)  | 

4 

(I  NT  51  <  10  00)  | 

5 

(IN  TE1  <200)  or  ( IN  IE  1  >30  1 )  | 

6 

(INI  El  <400)  or  (INIE1  >6  0  1 )  | 

7 

( I  NT  E  1  <  20  1)  | 

8 

(I  NT  El  <  40  1)  | 

9 

(INTE  1  <20  1)  or  (INTE1>800)  j 

|  The  Target  Attribu te-Values  for  Each:  ] 

|  (3  NT  El  ,  INTE2  ,M  UL  T I  ,STH0  0  ,  S I RQ  1 ,  ST  BO  2 ,  S  IB  0  3,  SIR  04,  | 
SIHC5,SIR06,STRC7,5TF.08,STB09,STR10,STfi11,SIF.  12)  ] 

Figure  6.4  The  Betrieval  Bequests. 

241  tc  230,  is  retrieved.  In  the  third  request,  1/2  cr  the 
database  is  retrieved.  Thirteen  of  the  24  clusters  must  be 
examined.  All  records  in  each  of  the  first  twelve  clusters 
are  returned.  Only  1/2  of  the  thirteenth  cluster,  defined 
ty  the  rarce  from  461  to  520,  is  retrieved.  The  system 
searches  only  f cr  records  having  values  in  the  range  from 
481  tc  5C0  in  this  cluster. 

The  entire  database  is  examined  in  the  fourth  reguest. 
The  firth  request  retrieves  2/5  of  the  database.  The  query 
is  divided  into  two  predicates,  to  obtain  all  records  from 
the  first  five  clusters,  and  the  last  four  clusters.  The 
sixth  request  is  a  retrieval  cf  4/5  of  the  datarase.  Again 
the  ^uery  is  divided  into  two  predicates,  to  ottair.  all 
records  from  the  first  10  clusters,  and  the  last  nine 
clusters. 


The  stvtntn  and  eighth  recjests  are  similar  in  intent. 
Tne  seventh  request  examines  1C  clusters,  repairing  crlg  1 
record  tc  re  retrieved  from  the  otn  Cxuster  and  r.eecing  all 
records  fret  tne  first  five  clusters.  The  eignth  reguest 
examines  15  clusters,  reguiring  only  1  record  tc  he 
retrieved  from  tne  11th  cluster  and  needing  ail  records  from 
tne  first  ten  clusters.  The  ninth  and  final  recuest  is 
similar  tc  the  fifth  reguest.  3ut  uniixe  tne  fifth  recuest, 
ten  additional  clusters  must  he  examined.  Cniy  tvic  cf  the 
records  kith  INTE1  values  of  201  and  801,  are  retrieved  from 
the  ten  additional  clusters.  All  records  m  the  remaining 
nine  clusters,  like  tte  fifth  reguest,  are  also  obtained  by 
this  retrieval.  Talle  VI,  a  presentation  of  the  number  of 
clusters  examined  versus  'tne  percent  or  the  database 
retrieved,  is  a  synopsis  of  the  previous  discussion  in 
tabular  fora. 

The  reguest  set  in  Figure  6.4  is  not  intenaec  tc  be 
representative  of  a  comprehensive  and  complete  recuest  set. 
The  gcai  is  not  to  exhaustively  measure  and  evaluate  VDES. 
Father,  ue  focus  or.  applying  the  performance  measurement 
metnccclcgy  to  MDBS  to  validate  the  basic  performance  and 
capacity  claims  ex  tie  system.  We  feel  that  these  reguests 
are  sufficient  for  such  a  validation.  Ke  will  refer  to 
tnese  rme  reguests,  i.e.  ,  retrievals,  ny  their  reccrd 
number  in  subsequent  discussicD. 


VI.  THE  TEST  RESULTS 

In  this  chapter,  we  present  the  results  obtained  :rot 
the  perfcr ranee  mea s treaen t  cf  MDBS.  MDBS  is  currently 
configured  with  the  primar  y—  memory— basea  directory  manage¬ 
ment.  Ir  this  versicr.  of  MDBS,  the  directory  tables,  i.e., 
the  AT,  EDIT,  and  CD1,  are  stored  in  tne  primary  memory.  Re 
expect  tc  achieve  different  results  wner.  version  r,  tne 
secondary— nemory-basec  directory  management  is  implemented. 
Iii  e  test  interface  is  utilized  to  send  the  retrieval 
requests  discussed  in  the  previous  cnapter  to  MDBS  for 
processing.  Each  request  is  sent  a  total,  of  ter.  tines  ter 
database  config uraticr.  The  response  time  of  each  request  is 
recorded.  After  some  trial  runs,  we  compute  the  standard 
deviation,  fie  determine  that  ten  repetitions  cr  each  request 
is  sufficient  to  provide  the  desired  accuracy. 

The  internal  processing  times  of  the  aessa  g e-na r  dim g 
routines  which  are  used  to  process  a  retrieval  request  are 
also  timed.  Retrieval  (1)  and  Retrieval  (2)  are  selected  to 
conduct  internal  timing.  These  requests  are  selected  since 
they  retrieve  the  smallest  portion  of  tne  test  catalase  and 
the  processing  time  fer  each  request  is  minimal.  Recall  tr.at 
eacn  message— handling  routine  is  timed  independently  cf  all 
ethers  and  that  each  routine  must  process  multiple  requests 
so  that  an  accurate  average  may  be  computed  for  the  time 
required  tc  process  that  request  type.  Sixteen  message- 
handling  routines  are  required  to  process  a  retrieve 
request.  If  we  send  twenty  requests  to  eacn  routine,  a  total 
cf  32C  requests  must  be  processed  by  MD3S.  3asea  cr.  thes<_ 
fig-ures,  the  time  required  tc  conduct  tne  internal  perform¬ 
ance  measurement  of  a  retrieval  tnat  nas  a  response  time  cf 
twenty  seccnds  will  be  approximately  107  minutes.  This 


f  i  jure  aces  not  induce  the  administrative  tine  re ; uir  ed  to 
process  the  internal  ceasurement  data.  For  this  reason,  *e 
limited  the  internal  performance  measurement  re  guests  to 
Retrievals  (1)  and  (2). 

Additionally ,  we  also  limited  the  rusher  or  repetitions 
per  message  handler  to  twenty.  Inis  is  acne  to  reduce  the 
processing  time  per  message  handler.  However,  this  decisio.. 
reduces  the  accuracy  of  the  internal  performance  measure¬ 
ment,  from  te n— th ou sands  to  hundredths  of  a  second.  It. us, 
the  internal  performance  measurement  times  provide  cr.iy  a 
rough  estimate  of  the  time  reguirei  to  handle  the  respective 
messa  ces . 

Ihe  first  section  of  this  chapiter  contains  the  external 
timing  results  obtained  from  our  measurements.  Ke  also 
discuss  the  performance  and  capacity  improvements  obtained 
hy  adding  backends.  In  the  second  section  we  present  tee 
results  from  internal  performance  measurement.  The  iir.al 
section  examines  the  inter-process  and  inter-computer 
message  processing  times.  Cne  final  note,  ti.e  units  on 
measurement  presented  in  the  tanies  of  this  chapter  ate 
expressed  ir.  seconds. 

A.  I  EE  EXTERNAL  PEEECBMANCE  E  ESOLTS 

Table  VII  provides  the  results  of  cne  external  perform¬ 
ance  measurement  of  iv.D35  without  the  internal  pe  r  f  ;  r  ia  r.c_- 
measu  reaer.t  software.  Inert  are  tnree  parts  to  Tarie  VII. 
Each  part  con  tains  the  me an  and  the  standard  deviation  on 
tae  response  times  for  retrievals  (1)  taro  ugh  (9),  which  are 
outlined  in  Chapter  V.  Ihe  three  parts  of  Taixe  VII  rtp  re¬ 
sent  three  different  cor.fi  g  urations  of  the  MIES  hardware  and 
the  database  recor  1  carac ity .  Ihe  first  part  has  hfrl 
configured  with  one  tacxend  and  the  database  loaded  wit:: 
100  0  records.  The  secci.d  part  a  as  d  jbS  configure  i  wit:,  two 
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Ihe  Response  Time 

Without  Internal  Performance  Evaluation  Software 
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IK  Re  cor  is  |j  2.*.  Records  j 

|  |  mean 

staev j 

L 

mean 

stdev  j  1  sear,  j  strev| 
i  1  i  t 

j  1  j  3.  20  8] C.  Cl 89j 

2.  051 

0  •  3324  j 1  3. 3 5  2 i  0. 0  id i j 

j  2  jl3.€91 

C .  0255  j 

j 

7.511 

0.0333  i  j  Ib.^jjo.CISs! 

j  3  |  26.492 

0.0244 j 

1  4.  1  64 

0  .  J  2  o  9  1  j  26  .  7  3  7  |  C  .  C  4  3  5  ] 

j  4  J52.  005 

C.C5391 
_  | 

26 . 586 

0  .  3  2  9  a  j  j  52.  1  7  3  1  3  .  3  23  c  J 

j  5  1  21.449 

C.  0  336 

11.309 

3. 3375] i  21. 55 012.02371 

j  6  |  42.  235 

C .  C 326  1 

| 

2  1  . o22 

3*  34^4]  j  ,  Jc*7|  w  .  3  -♦  J  \j  | 

|__Z  _i_l2-  235J 

C. G403 j 

6.  642 

0 . 0  2  8  9  j  1  1  2  .  3  4  7  j  0  .  2  3  7  1  1 

J  8  J  22.  532 

0.02961 

, 

1  1.764 

0. 0  3  00  j  |  22. 5 f J  j C. 0  1 1  Oj 

i  9  1  2  3.913 

C .  1 1  1 5  J 

12.62410  . 3350]  |  24.  1  c  9  j  C  .  0  1  o  1 1 

backends,  with  the  database  containing  1000  records,  split 
evenly  between  the  backends.  The  third  fart  iras  .“•’DBS  cor; in¬ 
ured  with  two  backends,  with  the  database  courier  :c  1399 
records,  also  split  evenly  between  the  backenos.  In  Table 
VII  we  notice  one  data  anomaly,  the  standard  deviaticr  ror 
request  (S)  in  the  cne— backend— with— IjOO— records  cor.ri  dura¬ 
tion.  Since  we  die  not  conduct  an  internal  per  fori  ar.ee 
measurement  on  this  request,  we  are  not  sure  what  causes 
tnis  SKewed  standard  deviation,  and  nence  wall,  not  attempt 
to  offer  an  explanation  of  this  anomaly. 

Given  the  data  presented  in  Tac2.e  VII, 
attempt  tc  verify  cr  disprove  the  two  MDBS 
claims.  We  beyin  by  calculating  tne  response- 
meat  c£  (IDES.  Ihe  r  e  s  por.se-  t  lae  improvement  is 


we  car.  row 
per  r  or  iar.ee 
uit;  1 3  £  ro  ve- 


denr.cj  to  ce 


Figure  7. 1  The  Response-Tine— Impr ovemeot  Calculation. 


tne  percentage  impr c vemen t  in  the  response  time  cf  a 
reguest,  when  the  reguest  is  executed  in  n  backends  as 
opposed  to  one  backend  and  the  number  of  records  in  the 
database  remains  the  same.  Figure  7.1  provides  the  formula 
used  to  calculate  the  respcnse-tiae  improvement  for  a 
particular  request,  where  C cnf ig urati on  3  represents  n  back¬ 
ends  and  Configuration  A  represents  one  backend.  Thus,  in 
Table  VIII  we  present  the  response-time  improvement  rcr  the 
data  given  in  Table  VII.  Notice  that  the  response-time 
improvement  is  lowest  for  reguest  (1),  whicn  represents  a 
retrieval  of  two  records  of  the  database.  Cn  tae  other  hand, 
the  re spcnse-time  improvement  of  request  (4) ,  whicn 
retrieves  all  cf  the  database  information  is  highest, 
approaching  the  upper  bound  of  firty  percent.  In  general,  we 
find  that  the  re spcnse-ti me  improvement  increases  as  the 
number  of  records  retrieved  increases.  Inis  seems  to  support 
a  hypothesis  that  ever  if  the  database  grows,  the  response— 
time  improvement  will  remain  at  a  relatively  high  (between 
40  and  5C  percent)  level. 

Next  we  calculate  the  response-time  reduction  cf  ?!o3S. 


The  res  rcnse-time  reduction  is  defined 
tion  in  response  time  of  a  reguest. 


to  be  the  the  reiuc— 


reguest  is 


wnen  the 


TAEIE  VIII 

I he  Response-Time  Improvement  Between 
1  and  2  Backends  (External  Measurement  Only) 


E  eguest 
N  umber 

Response  Time 
Improvement. 

1 

36.07 

2 

45.14 

3 

4  6.53 

a 

48.94 

5  ! 

47.27 

6 

48.81 

7 

4  5.93 

8 

47.79 

9 

47.2  1 

IX  Records 
No  Internal— 
Measure ment  Software 


executed  in  n  backends  containing  nx  number  cf  records  as 
opposed  to  cne  backend  with  x  number  of  records.  Figure  7.2 
provides  the  formula  used  to  calculate  the  the  response-time 
reduction  for  a  particular  retrieval  request,  where  configu¬ 
ration  A  represents  one  backend  with  x  records  and  configu¬ 
ration  5  represents  n  backends,  each  with  x  records.  In  | 

Table  IX  we  present  the  r esp cnse— ti me  reductions  for  the 
data  given  in  Table  VII.  Notice  that  the  response-time 
reduction  is  worst  for  reguest  (1)  ,  which  represents  a 
retrieval  of  two  records  of  the  database.  Cn  the  ether  hand, 
the  response-time  reductions  for  the  retrievals  which  access 
larger  portions  of  the  database,  requests  (4)  and  (6),  have 
only  a  snail  response-time  reduction.  In  general,  we  found 
t.nat  the  response— time  reductic..  decreases  as  the  number  of 
records  retrieved  increases,  i.e.,  the  response  time  remains 
virtually  constant.  Again  we  seem  to  have  evidence  to 


support  the  hypothesis  that,  as  the  sine  of  the  database 


Table  X  provides  tne  results  of  external  perforrrar.ee 
meas  u  re  men  t  of  M03S  with  internal  performance  measurement 
software  rr  place.  There  are  two  parts  to  Table  X.  Each 
part  contains  the  mean  and  the  standard  deviation  of  the 
respense-tirres  for  the  requests  (1)  tnrough  (6),  which  are 
outlined  in  Chapter  V.  The  two  parts  of  Table  X  represent 
two  different  configurations  cf  the  MDBS  hardware  and  tne 
database  record  capacity.  Part  one  has  MDBS  configured  with 
one  backend  and  the  database  leaded  with  1000  records.  Part 
two  has  MDBS  configured  with  two  backends,  with  the  datacase 
containing  1000  records,  split  evenly  between  the  backends. 
Re  did  r.ct  measure  the  response  times  with  two  thousand 
records  distributed  ever  two  backends.  We  felt  that  r.c  addi¬ 
tional  information  wculd  be  gained  rsy  conducting  the  meas- 
urement  s . 


TABLE  X 

The  Response  Time 
With  Internal  Performance  l 

(in  seconds) 
leasurement  Software 

stdev 


1.205  0.0436 
11.418  0.0172 
25.903  0.0119 
5C.750  0.0374 
2C.972  0.0271 
4  1.262  0.033  1 


mean  I  stdev 


2.219 

0.  04  74 

7.  401 

0. 0277 

1 3.854 

0. 03b  1 

26.402 

C.  0596 

1  1.244 

0. 0528 

21.517 

0.0575 

An  interesting  accmaly  is  discovered  when  we  compare  the 
response  times  cf  the  external  and  internal  performance 
measurement  tests,  i.e.,  parts  one  and  two  of  Tables  VII  ana 


X  for  requests  (1)  through  (6)  .  We  actually  found  a  general 
impre veient ,  from  O.H  to  5X,  in  the  response  times  cx  the 
requests  when  the  internal  performance  measurement  software 
is  part  cf  the  MDBS  code.  Cne  hypothesis  is  that  this  is 
due  to  the  tanner  in  which  MDBS  is  implemented  on  the  tack- 
ends.  Currently,  there  is  net  sufficient  physical  memory 
available  on  each  backend.  The  result  is  that  disk  overlays 
are  used  to  swap  in  the  code  necessary  to  run  MDBS.  The 
additional  internal  performance  measurement  code  may  cause 
the  operating  system  to  overlay  differently,  thereby 
benefiting  the  overall  performance  of  MDBS.  Vie  still 
believe  that  there  is  an  overhead  induced  by  the  internal 
measurement  code  and  Table  XI  provides  evidence  by  demon¬ 
strating  that  the  response-time  improvement  achieved  by 
adding  a  backend  is  net  as  good  as  tnat  of  Table  VIII. 


TABU  XI 

The  fiesponse  Time  Improvement  Between 
1  and  2  Backends  (With  Internal  Measurement  Also) 


Reguest 

Number 

1 

Response  Time 
Improvement 

1 

30.76 

2 

44.34 

3 

46.52 

4 

47.98 

5 

46.39 

6  i 

47.35 

IK  Records 
Evaluation 
S  of tware 


E.  TEE  INTERNAL  PERiCEMANC  E  RESOLTS 

Table  XII  provides  the  results  or  the  internal  yeriorn- 
ance  treasure  me  nt  of  FTPS  for  a  retrieval  request.  The  tires 
measured  for  each  aessage-har  dimg  routine  are  given  for 
tot. h  regrest  (1)  and  (1).  The  message—  nan  dling  routines  are 
listed  •with  the  d  DE  S  process  wr.ich  contains  tne  routine. 
Although  the  results  are  givct  to  four  uecimai  places,  we 
only  trust  the  accuracy  to  the  second  decimal  place.  The 
reason  for  this  has  teen  discussed  in  tr.e  introduction  to 
this  chapter,  k'e  are  not  experts  on  the  ME3S  system.  X e  can, 
however,  make  a  few  comments  on  Table  XII  and  we  are  sure 
that  these  who  are  experts  can  use  tne  results  contained  in 
Table  XI I  tc  draw  mere  in-de^th  conclusions  on  tne  system. 
He  see  that  the  controller  processes,  i.e.,  Fecuest 
Preparation  and  Post  Processing,  spend  very  little  time  in 
processing  the  retrieval  request.  Tnis  is  a  major  design 
goal  cf  ME E S  and  is  necessary  to  prevent  a  bottleneck  at  the 
controller  when  the  number  of  backends  increases  substan¬ 
tially.  It  at pears  that  this  goal  is  aet  successfully.  He 
also  observe  that  the  results  obtained  from  Concurrency 
Control  are  consistent  and  cf  short  duration.  This  is 
expected  since  there  is  only  one  request  in  the  system  at  a 
time  and  no  access  contention  can  occur.  These  tables  should 
then  be  considered  as  containing  the  nest-case  times.  The 
majority  or  work  done  in  the  bac.eend  is  at  Record 
Processing.  Observing  the  process  timings  in  Reccrd 
Processing,  we  see  that,  for  both  requests,  the  addition  cf 
an  extra  tackenc  reduces  the  record  processing  time  by 
nearly  half. 

C.  TEE  MESSAGE  PROCESSING  RESOLTS 

Table  XIII  provides  some  average  times  relating  to 
ir.ter-pr ccess  message  passing  times  on  the  controller  and 
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TABLE  XII 

Message  Handling  Routine 
Processing  Times  for  a  Retrieval  Request 


MIES 

E i ccess 

- 

Message 

Handling 

Routine 

F.  equestj 
Hum her  | 

1 

One 

Backend 

IK  Records; 

Two 

Backends 
IK  Record 

Request 

Record  Count 

1 

0.0005 

0.C015 

E  ref ara tion 

To  Pest  Proc 

2 

.. 

0.0000 

-  .  . 

0. oooo 

Parse 

1 

r 

0.0200 

0.  C190 

Iraffic  Unit 

2  I 

1 

0.0180 

0.  0  165 

3r cadcast 

1 

0.0025 

0  .  C  0  2  5 

Request 

2 

” 

0 . 0  0  65 

0 . 0  0  3  C 

Ecst 

Collect 

1 

0  .  0  4  65 

0. 075 C 

E iocess ing 

R  esults 

2 

0 .0850 

o.oa  is 

Directory 

Parsed 

1 

" 

0 . 0 1>  95 

0.0450 

Eanageiieht 

Traffic  Unit 

L 

0.0525 

0. C49  1 

Did  Sets 

1 

0.0516 

0 .  C56  6 

Locked 

2 

0.0566 

0 . 05  6  6 

Cid  Sets 

1 

0.0533 

0. 0349 

Locked 

2 

0.0450 

0.0433 

Descriptor 

1 

na 

0.0391 

Ids 

2 

na 

0.055c 

Concurrency 

Cids  for 

1 

0 . 042^ 

0. 0433 

Ccntrcl 

Traffic  Unit 

2 

0.0425 

0. 0433 

Did  Sens 

1 

0.  0566 

0.040c 

Traffic  Unit 

2 

0.0503 

0.  C5  16 

Did  Sets 

1 

0.0025 

0. CO  16 

Released 

2 

0.0008 

C. GCC6 

Record 

Entire 

1 

2 . 6462 

1.3775 

E  rocess  ing 

Process 

2 

““ 

12.7100 

6. 5716 

Reques  t  wit h 

1 

0 . 0  4oo 

0  .  0  4  j  c 

Disk  Address 

2 

L. 

0.0433 

0.0362 

Cld 

1 

0.0130 

0.0146 

Re  quest 

2 

__ 

0. 0  131 

0.  0166 

E  1C 

1 

0.0544 

0.0665 

Read 

2 

L_ 

0.359 3 

0  .  6  8  c  3 

Eisk 

1 

r 

0 . 0799 

0.0741 

Input/Outpu  t 

2 

0.0783 

0.0725 
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TAE1Z  XIII 

Inter-prccess  Message  Passing  Times 


locat  ion 

Time  to 
Constr  uct 
Messa  ge 

Time  to 
Recei ve 

M  es  sage 

Tiae  to 
Pass 
Message 

C  ent  roller 

Eacke  nd 

0.00249 

0.00830 

0.00267 

0.  004  10 

0. 00520 

0. 0  1250 

_ i 

the  tacKend.  Messages  ace  transmitted  between  two  processes 
cn  tcth  the  controller  and  backend.  Both  the  number  of 
messages  and  the  message  lengta  are  varied.  Cn  the 
controller,  the  number  of  messages  is  varied  from  1  tc  100 
wnile  the  message  lergth  is  varied  from  2  to  2000  bytes 
(sire  of  the  message  ouffers  in  MOBS  controller).  Or.  the 
bacKerd,  the  number  cf  messages  is  varied  from  1  to  5  C  wnile 
the  message  length  is  varied  from  1  to  1000  bytes  (size  of 
the  message  buffer  in  MDBS  backends).  It  takes  the  backend 
twice  as  long  to  process  a  message  as  it  aces  the 
controller.  Vie  believe  the  reason  to  be  hardware  processor 
speed.  Ar.  independent  test  showed  that  this  re  la  tiensr.  ip, 
cf  twe  to  one,  holds  in  how  long  it  takes  to  process  an 


assignment  statement  cn  the  respective  Hardware. 


It  Taile  XIV  we  provide  information  concerning  the  time 
to  process  inter-computer  messages  on  tae  PCL.  Messages  of 
length  less  than  forty  are  overshadowed  by  the  overhead  of 
the  PCI.  There  exists  a  linear  relationship  between  the 
message  length  and  tie  time  tc  pass  a  message  as  the  message 
length  exceeds  1 OJ  bytes.  We  can  therefore  expect  a  linear 
perf crmance  from  the  PCI  for  the  majority  of  the  MDBS  inter¬ 
computer  messages.  The  next  chapter  will  contain  seme 
concluding  remarks  and  discuss  areas  for  further  research. 


TAELS  XIV 

Inter-computer  Message  Passing  Times 


Message 

Length 

(Bytes) 

Time  to 

Pass 

Message 

1 

Change  | 

1C 

0 .0S49 

0.0000 

1C 

0  .  0551 

0.0002  | 

30 

0 .0554 

0.  0003  | 

4  C 

0 .0S57 

0.0003  j 

5C 

0.  1005 

0.  0003  | 

ec 

0.1011 

0.  0006  | 

70 

0.10  18 

0.  0007  | 

8  C 

0.  1023 

0.  0005  j 

SC 

0 .  102S 

0.  0006  | 

ICG 

0 .  1056 

0.0007  1 

2C0 

0.1136 

0.0100 

3CC 

0 .  1238 

0.0102  | 

4  CC 

0 .  133S 

0.0  10  1  | 

5  CC 

0  .  1 4  5  S 

0.0100  1 

10GC 

0 . 1 S  43 

0*  J  a04  J 

1 

VII.  THE  CONCLUSION 

A.  A  SUMMARY  OF  THE  EEEEORMANCE  MEASUREMENT  METHODOLOGY 

1  •  Ihfc  In  tern al  fctroraaice  Met  nod  do  c  y 

lae  internal  perr  ciaar.ce  mens  ar  emer.t  methodology 
provides  the  strategies  and  locations  for  tne  places er.t  of 
cneckpcirts.  It  further  provides  the  kinds  of  performance 
data  tc  re  collected-  This  indorsation  enables  a  tetter 
understanding  of  the  target  system  hy  measuring  certain 
capabilities,  such  as  the  time  spent  in  individual 
processes.  Using  this  information  of  how  the  system 
performs  internally  may  lead  to  design  modir ica tions  cr  to 
fine-tuning  of  tr.e  system  for  increased  performance. 

2 •  The  Ex  tern  a 1  lerformance  Measurement  Methodclcu y 

The  external  performance  measurement-  set  hcdcl  egy 
provides  the  strategies  for  a  macro  view  of  tne  datarase 
system  performance  by  measuring  the  system  as  a  whole.  Re 
focus  cn  the  measurement  of  the  response  time  or  the  target 
system  after  tne  issuance  of  a  request.  A  test  database  ani 
a  test  reguest  set  is  generated  usin^  software  tools. 

2  •  Cc  dining  the  Internal  and  External  Measurement 

M  et  hod oleuies 

The  natural  combination  of  tne  internal  and  external 
performance  measurement  methodologies  is  synergistic  lr  the 
amount  cr  information  that  is  provided.  Tne  cverneai 
incurred  -her.  using  internal  performance  measurement  ccct  is 
accurately  cetermined  using  this  aetuoioiogy  corns  ina  ti  cr. . 
The  externa-  performance  measurement  timings  can  be  properly 
interpreted  using  the  internal  perrormance  measurement 
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t n e  vnclc  c; 


results.  Ey  combining  the  two  seas ureme r.ts , 
the  leasureient  results  is  lore  meaningful  and  usexui  than 
the  ircividual  results. 

E.  A  SUHMAEY  OF  THE  EETHODC1CGY  APPLICATION 

Thrilling  and  unexpected  results  are  collected  when  this 
methodology  is  applied  to  a  target  system,  i.e.,  MDBS. 
First,  the  methodology  proves  itself  to  he  successful  in 
attempting  to  verify  the  performance  and  capacity  claims  of 
MDBS.  This  results  from  being  able  to  collect  sufficient 
data  on  a  target  system  tc  maxe  definitive  statements 
concerning  its  performance.  The  application  of  this  method¬ 
ology  tc  DDES  is  also  surprisingly  easy. 

A  second  result,  is  that  tne  performance  and  capacity 
claims  of  MDBS  have  heen  validated.  These  claims  are:  1) 
that  hy  increasing  the  number  of  bacKends  used  as  a  part  cf 
the  catalase  system  ard  by  keeping  the  size  of  the  datarase 
constant,  the  response  time  of  the  same  transactions  is 
proportionally  decreased,  and  2}  that  ny  increasing  the 
number  cf  backends  ard  also  increasing  the  size  of  the  data¬ 
base,  the  response  time  remains  relatively  constant.  These 
claims  are  validated  by  the  results  given  in  Chapter  VI. 

These  spectacular  results  provide  a  wealth  of  informa¬ 
tion  from  which  several  conclusions  can  re  made.  Ke  rind 
that  under  MDBS,  the  response-time  improvement  increases  as 
the  number  of  records  retrieved  increases.  Also,  tne 
response- ti me  reduction  decreases  as  the  number  or  records 
retrieved  increases.  Though  tne  performance  measurement 
results  indicate  an  improvement  in  tne  response  time  or  the 
reguests  when  the  internal  performance  measurement  software 
is  part  cf  MDBS  code,  it  is  felt  tnat  tais  phenomenon  is  the 
result  of  differing  system  overlays  and  that  the  induced 
overhead  cf  internal  measurement  code  still  needs  tc  be 
calc  ula  ted . 
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The  results  or  the  internal  performance 
indicate  that  the  controller  processes,  i.e  .  ,  Be  guest 
Preparation  and  Post  Processing,  spend  very  little  tire  to 
process  the  retrieval  reguest.  Tne  results  obtained  iron 
Concurrency  Control  are  totn  consistent  and  of  short  dura¬ 
tion,  as  expected.  The  results  also  snow  that  the  majority 
ci  wcrk  is  being  dene  in  Record  Processing  and  that  the 
addition  of  a  backend  reduces  the  record  processing  tire  by 
nearly  half.  We  disco vered  that  it  takes  the  lacker. d  twice 
as  long  tc  process  a  message  as  it  does  the  ccn tre  i^er , 
possibly  due  to  hardware  processor  speed.  Finally,  there 
exists  a  linear  relationship  retween  the  message  length  and 
the  time  tc  pass  a  message  as  the  message  lengtn  ''xceeis  10  3 
bytes. 

C.  EECCMMENDATIOHS  ECE  FUTURE  EFFORTS 

Future  improvements  can  be  made  in  the  performance  meas¬ 
urement  methodology  by  tne  automation  or  the  existing 
external  software  tocis.  S pecii icaiiy ,  tne  anility  tc  start 
a  test  which  will  execute  a  pre-deter mined  set  of  reguests  a 
pre-deteimr.ed  number  of  times  for  each  reguest,  and  collect 
the  results  in  a  file  is  a  desireacle  feature. 
Additionally,  since  the  methodology  is  intended  tc  be 
general  in  use,  the  methodology  needs  to  be  applied  to 
different  database  systems  tc  discover  its  applicability, 
ease  cf  use,  and  usefulness  in  overall  performance  measure¬ 
ment  cf  the  target  system. 

Ir.  terms  of  the  application  of  this  metnodoiogy  to  h.DRS, 
a  complete  and  thorough  test  or  tne  system  needs  tc  be 
conducted.  An  exhaustive  test  or  M15S  would  include 
conducting  test  with  databases  that  nave  varyin  'j  rtccri 
sizes.  Further,  testing  the  system  rg  varying  the  number  or 
eirectory  attributes,  descriptors,  and  u.usttro  would  ir. ci- 
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cate  the  rcie  of  t  he  directory  data  in  tne  system.  Insert, 
delete,  ar.d  update  requests  oust  also  t e  Measure:  to 
discover  their  impact  on  system.  -.errurxance.  lastly,  1 1  e 
ffeasureaent  sh o u Id  be  extended  to  test  ,M.D5S  ween  it  use.=  tr.-_ 
secor.  ca  r  v— memo  r  v-bas  e  c  directorv  aanaicxent  ’.’recess. 
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